# GCP VPC Flow Logs via Pub/Sub Setup

Netography Fusion ingests VPC flow logs from Google Cloud Platform (GCP) via a GCP Pub/Sub subscription. The steps to integrate with GCP are:

1. Enable VPC flow logs
2. Create a Pub/Sub topic
3. Create a Cloud Logging Sink Pub/Sub for the topic
4. Create a Pub/Sub Pull Subscription to the topic
5. Add Netography's GCP service account as a principal for the Pub/Sub subscription
6. In Fusion, add GCP as a new flow source.

{% hint style="success" %}
**👍You can onboard an entire GCP organization or folder by following these steps one time**

You only need to create 1 GCP Pub/Sub topic, 1 GCP aggregated Cloud Logging Sink, 1 GCP Pub/Sub Subscription, and 1 Fusion GCP flow source to onboard GCP VPC flow logs to Fusion for as many VPCs, subnets, projects, and sub-folders as you have in your GCP organization or in a single folder within it. If you need more granular control over which enabled VPC flow logs are routed to Netography, you can instead create 1 GCP Pub/Sub topic, 1 GCP Pub/Sub Subscription, 1 Fusion GCP flow source, and as many Cloud Logging Sinks as you need, all routed to the one topic.

Additional information on using an aggregated logging sink, including its benefits and limitations, is described in step 3 below.
{% endhint %}

In addition to ingesting VPC flow logs, you may want to enrich them with context from GCP resources by adding the [GCP Context Integration](https://docs.netography.com/enrich-traffic-with-context/configure-context-integrations/gcp).

{% hint style="info" %}
**🤖Using Terraform to automate onboarding**

Access Netography's Terraform automation at our GitHub repo: <https://github.com/netography/neto-onboarding>. For access to the repo, email <support@netography.com> with your GitHub ID or with a request for access to the latest release package.

Netography provides a Terraform project, `neto-onboarding`, that delivers Netography Fusion Cloud Onboarding Automation for AWS Organizations, Azure Tenants, and GCP Organizations.

This automation provides the following capabilities, which you can use in whole or in part:

* Enables and configures AWS VPC flow logs, Azure VNet flow logs, and GCP VPC flow logs based on a simple policy and tags that define which VPCs/VNets are in scope.
* Deploys all the infrastructure required to integrate with Fusion across multiple accounts (AWS), subscriptions (Azure), and projects (GCP) in a single deployment.
* Adds VPCs/VNets configured for flow logging to Netography Fusion as traffic sources.
* Deploys a single AWS Lambda function, Azure Function, or Google Cloud Function that provides context enrichment across all accounts/subscriptions/projects as an outbound push from your cloud to the Fusion API, eliminating the need to grant Netography permissions to directly enumerate resource properties or to add individual context integrations in Fusion for each cloud account.
* Monitors for VPC/VNet changes, triggering flow log enablement and configuration, onboarding to Fusion any new VPCs/VNets that come into scope, and offboarding VPCs/VNets that are removed or no longer in scope.
{% endhint %}

## Prerequisites <a href="#prerequisites" id="prerequisites"></a>

* If you have GCP organization policy constraints in place, you may be unable to perform these steps until you update the organizational policies. If you receive an error referring to an organization policy, update the policy and retry. Updating an organization policy requires the *Organization Policy Administrator* role ( `roles/orgpolicy.policyAdmin`).
* You need sufficient permissions in GCP to perform each step. The GCP documentation referenced in each step details the roles and permissions associated with that action.

## GCP Setup <a href="#gcp-setup" id="gcp-setup"></a>

### 1. Enable VPC flow logs <a href="#id-1-enable-vpc-flow-logs" id="id-1-enable-vpc-flow-logs"></a>

You can skip this step if you already have VPC flow logs enabled for the networks to monitor.

Follow these steps using the configuration settings below: [GCP: Enable VPC Flow Logs when you create a subnet](https://cloud.google.com/vpc/docs/using-flow-logs#network-management)

Additional instructions for enabling GCP VPC flow logs are available at [GCP: Use VPC Flow Logs](https://cloud.google.com/vpc/docs/using-flow-logs).

You can create filters in GCP to limit which traffic generates flow logs if you do not want flow logs for all traffic. To include only traffic that crosses a VPC boundary, use the filter expression `'!(has(src_vpc.vpc_name) && has(dest_vpc.vpc_name))'`.

#### Flow Log Configuration <a href="#flow-log-configuration" id="flow-log-configuration"></a>

| Field                  | Value      |
| ---------------------- | ---------- |
| `Aggregation Interval` | `1 minute` |
| `Sample Rate`          | `100`      |
| `Include Metadata`     | `Yes`      |
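As a sketch, the same configuration can be applied to an existing subnet with the `gcloud` CLI; the subnet and region names below are placeholders, so substitute your own:

```shell
# Enable VPC flow logs on an existing subnet with the recommended settings:
# 1-minute aggregation, 100% sampling, and all metadata annotations included.
# "my-subnet" and "us-central1" are placeholder values.
gcloud compute networks subnets update my-subnet \
    --region=us-central1 \
    --enable-flow-logs \
    --logging-aggregation-interval=interval-1-min \
    --logging-flow-sampling=1.0 \
    --logging-metadata=include-all
```

An optional `--logging-filter-expr` flag can also be added here to apply a filter expression such as the cross-VPC example above.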

#### Option 1. Enabling VPC Flow Logs at the Subnet Level <a href="#option-1-enabling-vpc-flow-logs-at-the-subnet-level" id="option-1-enabling-vpc-flow-logs-at-the-subnet-level"></a>

1. On the **Subnets in current project** tab, select one or more subnets and then click **Manage flow logs**.
2. In **Manage flow logs**, click **Add new configuration** to create a new VPC flow log configuration.
3. Do one of the following:
   1. If you selected one subnet, in the **Configurations — Subnets** section, click **Add a configuration**.
   2. If you selected multiple subnets, in the **Configure VPC Flow Logs** section, select **Network Management API**.
4. For **Name**, enter a name for the new VPC Flow Logs configuration.
5. Change the **Aggregation Interval** to `1 minute`.
6. Optional: Adjust the **Description** and any of the settings in the **Advanced settings** section:
   1. **Log filtering**: By default, **Keep only logs that match a filter** is deselected.
   2. **Include metadata in the final log entries**: By default, **Metadata annotations** includes all fields.
   3. **Secondary sampling rate**: `100%` means that all entries generated by the primary flow log sampling process are kept.
7. Click **Save**.

#### Option 2. Enabling VPC Flow Logs for VPC Networks <a href="#option-2-enabling-vpc-flow-logs-for-vpc-networks" id="option-2-enabling-vpc-flow-logs-for-vpc-networks"></a>

1. On the **Networks in current project** tab, select one or more networks and then click **Manage flow logs**.
2. In **Manage flow logs**, click **Add new configuration** to create a new VPC flow log configuration.
3. In the popup window, under **Configurations - VPC networks** click on **Add a configuration**.
4. For **Name**, enter a name for the new VPC Flow Logs configuration.
5. Change the **Aggregation Interval** to `1 minute`.
6. Optional: Adjust the **Description** and any of the settings in the **Advanced settings** section:
   1. **Log filtering**: By default, **Keep only logs that match a filter** is deselected.
   2. **Include metadata in the final log entries**: By default, **Metadata annotations** includes all fields.
   3. **Secondary sampling rate**: `100%` means that all entries generated by the primary flow log sampling process are kept.
7. Click **Save**.

#### Option 3. Configuring VPC Flow Logs at the Organization Level <a href="#option-3-configuring-vpc-flow-logs-at-the-organization-level" id="option-3-configuring-vpc-flow-logs-at-the-organization-level"></a>

Configurations created at an organizational level will apply to all VPCs within that organization.

1. Navigate to the [VPC Flow Logs](https://console.cloud.google.com/networking/vpc-flow-logs) configuration page.
2. Click **Add VPC Flow Logs configuration** and then click **Add a configuration for the organization**.
3. For **Name**, enter a name for the new VPC Flow Logs configuration.
4. Change the **Aggregation Interval** to `1 minute`.
5. Optional: Adjust the **Description** and any of the settings in the **Advanced settings** section:
   1. **Log filtering**: By default, **Keep only logs that match a filter** is deselected.
   2. **Include metadata in the final log entries**: By default, **Metadata annotations** includes all fields.
   3. **Secondary sampling rate**: `100%` means that all entries generated by the primary flow log sampling process are kept.
7. Click **Save**.

### 2. Create a Cloud Pub/Sub topic <a href="#id-2-create-a-cloud-pubsub-topic" id="id-2-create-a-cloud-pubsub-topic"></a>

Create a Cloud Pub/Sub topic to publish flow logs to. If you are onboarding an individual GCP project, you can create the topic as part of creating the sink in step 3. If you are onboarding multiple projects at an organization or folder level, you can create a single topic in a designated project that you will use for centralized logging resources, and then use this one topic as the destination for a single aggregated sink, multiple individual project Cloud Logging Sinks, or a combination of the two.

To separately create the topic, follow these steps using the configuration settings below: [GCP: Create a Topic](https://cloud.google.com/pubsub/docs/create-topic#create_a_topic_2)

#### Pub/Sub Topic Configuration <a href="#pubsub-topic-configuration" id="pubsub-topic-configuration"></a>

| Field                        | Value                                          |
| ---------------------------- | ---------------------------------------------- |
| `Topic ID`                   | Any value ( e.g. `neto-flowlogs-pubsub-topic`) |
| `Add a default subscription` | `No`                                           |
| `Use a schema`               | `No`                                           |
| `Enable ingestion`           | `No`                                           |
| `Enable message retention`   | `Yes`- `1 Day`                                 |

Note: GCP charges for unacknowledged message retention beyond 1 day. In most circumstances, messages are acknowledged and removed from the topic in near real-time, but retention ensures no data is lost as long as the logs are read within that period. You can adjust the retention period based on your organization's requirements.

#### GCP Console Steps <a href="#gcp-console-steps" id="gcp-console-steps"></a>

1. Go to the **Pub/Sub Topics** page in the Google Cloud console.
2. Click **Create Topic**.
3. Fill out the form using the above configuration values, then click **Save**.
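As an alternative to the console, a sketch of creating the topic with the `gcloud` CLI, using the example topic ID from the table above:

```shell
# Create the Pub/Sub topic with 1 day of message retention.
# The topic ID is the example value from the table -- use any name you like.
gcloud pubsub topics create neto-flowlogs-pubsub-topic \
    --message-retention-duration=1d
```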

### 3. Create a Cloud Logging Sink Pub/Sub <a href="#id-3-create-a-cloud-logging-sink-pubsub" id="id-3-create-a-cloud-logging-sink-pubsub"></a>

Create a Cloud Logging Sink with a destination of Cloud Pub/Sub topic, using the topic you created in step 2 or creating the topic in the process.

{% hint style="info" %}
**ℹ️Using an aggregated sink for onboarding all projects in a GCP organization or folder**

If you are onboarding all the projects in a GCP organization, or all the projects that are children of a folder, you can use an aggregated sink to simplify the deployment. Using an aggregated sink lets you create 1 sink for a GCP organization or folder rather than 1 sink per project.

When you create an aggregated sink following these steps, all flow logs that are enabled in all child projects (including nested folders) will be routed to the aggregated sink. This will include any new projects, VPCs, or subnetworks that get added as children, and any new flow logs that are enabled will be automatically included.

An aggregated sink at the organization or folder level is ideal if you want to onboard all enabled flow logs within an organization or folder. If you have multiple folders to onboard (that are not nested within each other), you can create 1 aggregated sink for each folder, and route each of those sinks to the same Pub/Sub topic.

**Choosing the right design pattern for GCP logging sinks**

There is not one design for GCP logging sinks that is right for all organizations. Reach out to Netography Support if you would like further guidance in this area. We would be happy to set up a design session to discuss your organization's specific use case and requirements and determine the best approach together, or to review a proposed design before you implement it.

**Using exclusion filters to exclude project(s) or subnetwork(s)**

If you want to include all the enabled flow logs by default, but exclude specific projects or subnetworks (or any other criteria you can write a filter for in GCP), you can add up to 50 exclusion filters to a sink (and each filter can be 20k characters with logical operators).

To exclude a project: `logName:projects/PROJECT_ID`

To exclude a subnetwork: `resource.labels.subnetwork_name="SUBNET_NAME"`

For more filter examples, see [GCP Logging > Sample queries](https://cloud.google.com/logging/docs/view/query-library).

**Excluding newly enabled flow logs**

If you are in an organization that uses VPC flow logs for multiple use cases, such as application troubleshooting directly in GCP, you may face the circumstance where an application team needs to enable VPC flow logs for a set of subnetworks that will generate a very high volume of logs, but you do not want to onboard these flow logs to Fusion. The default behavior of a sink is to include all these logs by default.

In this case, you will want to ensure that any GCP administrator who can enable flow logs is aware of which folders have sinks that will capture these logs by default, and that an exclusion filter needs to be added to the sink **BEFORE** these flow logs are enabled to avoid creating an undesired spike in flow log volume.

**Manually including newly enabled flow logs**

If you are onboarding a limited scope of projects or subnetworks to Fusion and want to maintain tighter control over which flow logs are onboarded, so that another GCP administrator cannot inadvertently start sending flow logs to Fusion by enabling flow logs in a subnetwork they control, you may want to instead use a sink design that includes only the enabled flow logs you specify and does not onboard any newly enabled flow logs by default.

In this case, instead of using an exclusion filter for the sink, you can use the **Inclusion Filter** to specify only the specific projects, subnetworks, or other criteria you want to use. The same filters shown for exclusions above can be used in the inclusion filter to include only those matching the filter.

You can end up with a very long inclusion filter if you individually include each project or subnetwork by name; if you attempt this for hundreds of criteria, you will reach the 20K character limit for a filter and no longer be able to add more. Use this pattern in cases where the filter will not approach that limit or become too complex to manage.

**Additional steps when creating an aggregated sink**

To use an aggregated sink, you will need `Owner` access to the sink's destination, and to perform the following steps when creating the sink:

1. Select the organization or folder to onboard in the GCP project picker.
2. When creating the sink, select `Include logs ingested by this folder and all child resources` in the section **Choose logs to include in sink** (this option will not appear if you selected a project).
3. Add the sink's **writer identity** as a principal by using IAM, and then grant it the Pub/Sub Publisher role ( `roles/pubsub.publisher`). See [GCP: Route logs to supported destinations > Set destination permissions](https://cloud.google.com/logging/docs/export/configure_export_v2#dest-auth). *This step may not be required in your organization.*

For more information on aggregated sink configuration, see [GCP: Collate and route organization- and folder-level logs to supported destinations](https://cloud.google.com/logging/docs/export/aggregated_sinks#create_an_aggregated_sink)
{% endhint %}

Follow these steps using the configuration settings below: [GCP: Create a sink](https://cloud.google.com/logging/docs/export/configure_export_v2#creating_sink).

#### Cloud Logging Sink Configuration <a href="#cloud-logging-sink-configuration" id="cloud-logging-sink-configuration"></a>

| Field                                  | Value                                                                           |
| -------------------------------------- | ------------------------------------------------------------------------------- |
| `Sink name`                            | Any value ( e.g. `neto-flowlogs-sink`)                                          |
| `Sink description`                     | Any value (e.g. Netography Fusion flow log ingest)                              |
| `Sink destination service type`        | `Cloud Pub/Sub topic`                                                           |
| `Sink destination Cloud Pub/Sub topic` | Create a topic or use topic created in previous step                            |
| `Inclusion filter`                     | `resource.type="gce_subnetwork" AND log_id("compute.googleapis.com/vpc_flows")` |
| Enable message retention               | `Yes`- `1 Day`                                                                  |

**Inclusion Filter**

The inclusion filter `resource.type="gce_subnetwork"` will include all VPC flow logs in the sink. You can add inclusion or exclusion filters based on your desired configuration. For example, to publish to the sink only the VPC flow logs for traffic entering or leaving a VPC (excluding internal intra-VPC traffic), the inclusion filter would be:

`resource.type="gce_subnetwork" AND NOT ( jsonPayload.src_vpc.vpc_name:* AND jsonPayload.dest_vpc.vpc_name:* )`

Adding this filter at the sink will still generate the VPC flow logs for intra-VPC traffic but will not deliver those logs to Fusion (this may be useful if you are using intra-VPC flow logs for other purposes). To filter which VPC flow logs are generated, set the filter in the VPC flow log configuration instead of at the sink (see [GCP: Filtering VPC flow logs](https://cloud.google.com/vpc/docs/flow-logs#filtering)).

#### GCP Console Steps <a href="#gcp-console-steps-1" id="gcp-console-steps-1"></a>

1. Go to the **Log Router** page in the Google Cloud console.
2. Select the project (or folder or organization if using an aggregated sink) to create the sink in.
3. Click **Create sink**.
4. Fill out the form using the above configuration values, then click **Save**.
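A sketch of the equivalent `gcloud` commands, assuming the example topic and sink names used above; the project, organization, and folder IDs are placeholders to substitute:

```shell
# Project-level sink routing VPC flow logs to the Pub/Sub topic.
gcloud logging sinks create neto-flowlogs-sink \
    pubsub.googleapis.com/projects/MY_PROJECT/topics/neto-flowlogs-pubsub-topic \
    --log-filter='resource.type="gce_subnetwork" AND log_id("compute.googleapis.com/vpc_flows")'

# Aggregated sink at the organization level instead; --include-children routes
# logs from all child projects and folders. Use --folder=MY_FOLDER_ID for a
# folder-level aggregated sink.
gcloud logging sinks create neto-flowlogs-sink \
    pubsub.googleapis.com/projects/MY_PROJECT/topics/neto-flowlogs-pubsub-topic \
    --organization=MY_ORG_ID --include-children \
    --log-filter='resource.type="gce_subnetwork" AND log_id("compute.googleapis.com/vpc_flows")'
```

After creation, `gcloud logging sinks describe neto-flowlogs-sink` shows the sink's writer identity, which may need the Pub/Sub Publisher role on the topic as noted in the aggregated-sink steps above.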

### 4. Create a Pub/Sub Pull Subscription to the topic <a href="#id-4-create-a-pubsub-pull-subscription-to-the-topic" id="id-4-create-a-pubsub-pull-subscription-to-the-topic"></a>

Follow these steps using the configuration settings below: [GCP: Create a pull subscription](https://cloud.google.com/pubsub/docs/create-subscription#create_a_pull_subscription).

#### Pub/Sub Subscription Configuration <a href="#pubsub-subscription-configuration" id="pubsub-subscription-configuration"></a>

| Field                        | Value                                                                |
| ---------------------------- | -------------------------------------------------------------------- |
| `Subscription ID`            | Any value ( e.g. `neto-flowlogs-sub`)                                |
| `Cloud Pub/Sub Topic`        | `Topic ID` from previous steps (if creating from Subscriptions page) |
| `Delivery Type`              | `Pull`                                                               |
| `Message retention duration` | `1 Day` *(or based on your requirements)*                            |
| `Retry policy`               | `Retry after exponential backoff delay` (Default min/max values)     |

Default values for all other fields can be used.

#### GCP Console Steps <a href="#gcp-console-steps-2" id="gcp-console-steps-2"></a>

1. Go to the **Topics** page in the Google Cloud console.
2. Click **⋮** next to the topic you created in the previous step.
3. From the context menu, select **Create Subscription**.
4. Fill out the form using the above configuration values, then click **Save**.

Note: Alternatively, you can create a subscription from the **Subscriptions** page by entering the `Topic ID` from the previous step.
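A sketch of the same subscription created with the `gcloud` CLI, using the example IDs from the tables above:

```shell
# Create a pull subscription on the topic with 1 day of message retention
# and the default exponential-backoff retry policy (10s min, 600s max).
gcloud pubsub subscriptions create neto-flowlogs-sub \
    --topic=neto-flowlogs-pubsub-topic \
    --message-retention-duration=1d \
    --min-retry-delay=10s --max-retry-delay=600s
```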

### 5. Add Netography's GCP service account as a principal to the Pub/Sub subscription <a href="#id-5-add-netographys-gcp-service-account-as-a-principal-to-the-pubsub-subscription" id="id-5-add-netographys-gcp-service-account-as-a-principal-to-the-pubsub-subscription"></a>

To grant Netography access to read logs from the Pub/Sub subscription, add the Netography GCP service account as a new principal in the subscription.

{% hint style="info" %}
**📘If you have a Domain Restricted Sharing Organizational Policy**

If your GCP organization has an Organizational Policy constraint for Domain Restricted Sharing (`constraints/iam.allowedPolicyMemberDomains`), you must add a rule to that policy to allow Netography's GCP customer ID `C04ddcbu8` before adding the principal to the Pub/Sub subscription.

**This constraint is the default setting for all GCP organizations created on or after May 3, 2024.**

If this policy restriction exists and you do not add the rule, you will receive the following error when you save the Pub/Sub Subscription:\
**IAM policy update failed** - The ‘Domain Restricted Sharing’ organization policy (`constraints/iam.allowedPolicyMemberDomains`) is enforced.

For detailed instructions and options for configuration, see [GCP: Restricting Domains](https://cloud.google.com/resource-manager/docs/organization-policy/restricting-domains).

**Domain Restricted Sharing Configuration**
{% endhint %}

| Field          | Value                                         |
| -------------- | --------------------------------------------- |
| `Policy Value` | `sa-cloud@netography.iam.gserviceaccount.com` |
| `Policy Type`  | `Pub/Sub Subscriber`                          |
| `Custom Value` | `C04ddcbu8`                                   |

{% hint style="info" %}
**GCP Console Steps**

To update your Organizational Policy to allow you to grant Netography's GCP service account access to the Pub/Sub subscription:

1. Go to the **Organization Policies** page in the Google Cloud console **IAM & Admin** section.
2. In the **Filter** field above the list of policies, type **Domain restricted sharing**.
3. You should see 1 policy with that name in the list, with ID `constraints/iam.allowedPolicyMemberDomains`. Click **⋮** and select **Edit Policy**.
4. Add a new rule (or add a value to an existing rule) for the policy under **Rules** using the above configuration values, then click **Set Policy**.
{% endhint %}
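As a sketch, the same rule can be added with the `gcloud` CLI; the organization ID is a placeholder to substitute:

```shell
# Append Netography's GCP customer ID to the allowed values of the
# Domain Restricted Sharing organization policy.
# MY_ORG_ID is a placeholder -- substitute your GCP organization ID.
gcloud resource-manager org-policies allow \
    iam.allowedPolicyMemberDomains C04ddcbu8 \
    --organization=MY_ORG_ID
```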

Follow these steps to add a principal to the subscription: [GCP: Access Control for Pub/Sub > Controlling access through the Google Cloud Console](https://cloud.google.com/pubsub/docs/access-control#console)

#### Pub/Sub Subscription Principal Configuration <a href="#pubsub-subscription-principal-configuration" id="pubsub-subscription-principal-configuration"></a>

| Field       | Value                                         |
| ----------- | --------------------------------------------- |
| `Principal` | `sa-cloud@netography.iam.gserviceaccount.com` |
| `Role`      | `Pub/Sub Subscriber`                          |

#### GCP Console Steps <a href="#gcp-console-steps-4" id="gcp-console-steps-4"></a>

1. Go to the **Subscriptions** page in the Google Cloud console in the **Pub/Sub** section.
2. Select the subscription you created in the previous step to bring up the subscription info panel on the right.
3. Select **Add Principal** in the info panel for the subscription.
4. Fill out the form using the above configuration values, then click **Save**.
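Equivalently, a sketch of granting the role with the `gcloud` CLI, using the example subscription ID from step 4:

```shell
# Grant Netography's service account the Pub/Sub Subscriber role
# on the pull subscription created in the previous step.
gcloud pubsub subscriptions add-iam-policy-binding neto-flowlogs-sub \
    --member=serviceAccount:sa-cloud@netography.iam.gserviceaccount.com \
    --role=roles/pubsub.subscriber
```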

## Netography Fusion Setup <a href="#netography-fusion-setup" id="netography-fusion-setup"></a>

### 6. Add a new GCP flow source to Fusion <a href="#id-6-add-a-new-gcp-flow-source-to-fusion" id="id-6-add-a-new-gcp-flow-source-to-fusion"></a>

In the Fusion portal, click the gear icon to go to Settings, navigate to **Traffic Sources**, click **Add Traffic Source**, select **GCP**, and fill out the form using the configuration below.

#### GCP Flow Source Configuration <a href="#gcp-flow-source-configuration" id="gcp-flow-source-configuration"></a>

The following fields are specific to the GCP configuration.

| Field               | Required | Description                                        |
| ------------------- | -------- | -------------------------------------------------- |
| `Project ID`        | yes      | GCP Project ID containing the Pub/Sub subscription |
| `Subscription ID`   | yes      | GCP Pub/Sub Subscription ID                        |
| `Sample Percentage` | yes      | GCP Flow Log Sampling Percentage                   |

The `Sample Percentage` field is only used for display purposes and does not need to be set to the same value as the value configured in GCP for the flow logs. If you are using multiple sampling percentages, you can use `100` as the value for this field.
