Migrate to Google Analytics 4 (GA4) #35299

Closed
23 tasks done
chalin opened this issue Jul 23, 2022 · 21 comments
Labels
area/web-development Issues or PRs related to the kubernetes.io's infrastructure, design, or build processes kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@chalin
Contributor

chalin commented Jul 23, 2022

This is part of a CNCF-wide effort to upgrade project websites to GA4 since Google has deprecated Universal Analytics (UA). For more details, see:

Tasks, stage 1 (no code changes are necessary)

  • Create a GA4 site tag using the GA4 setup assistant
    from the "Kubernetes - UA" (UA-36037335-10) analytics console.
    • The new GA4 measurement ID is G-JPP6RFM2BP. User access is the same as for the original UA site tag.
  • Manually add the GA4 site tag to the list of UA-36037335-10 Connected Site Tags under Admin > Tracking Info > Tracking Code
  • Confirm that the GA4 site tag is receiving events (for a screenshot, see below).
  • Confirm that the UA site tag, UA-36037335-10, is still receiving events (for a screenshot, see below).

Tasks, stage 2 (optional, can be done later)

  • Manually add UA-36037335-10 as a connected site-tag of the GA4 tag
  • Switch to using GA4 ID:
    • Netlify config - add environment variable (done by @onlydole, thanks!)
    • Google analytics: use GA4 site tag #36010
    • Wait for this to be deployed to the production server.
    • Confirm that the GA4 site tag is receiving events
    • Confirm that the UA site tag, UA-36037335-10, is still receiving events
  • Enable GA for existing release-branch deploys
    • Confirm that the GA4 site tag is receiving events
    • Confirm that the UA site tag, UA-36037335-10, is still receiving events
  • Remove the connection of the GA4 tag from the UA config and ensure events are still being processed:
    • Remove the GA4 tag from the UA config: Tracking Info > Tracking Code > Connected Site Tags section. By doing so, all non-kubernetes.io sites using the UA ID will stop forwarding events to the GA4 ID

Release-branches and page-feedback

Followup tasks

Related

/cc @nate-double-u @caniszczyk


Current website analytics info:

  • Analytics snippet is added via website/layouts/partials/head.html -- for the relevant excerpt, see:
    {{- if in (slice "production" "nonprod") hugo.Environment -}}
    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=UA-36037335-10"></script>
    <script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'UA-36037335-10');
    </script>
    {{- end -}}
  • Hugo version used is 0.101.0
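
For comparison, a GA4 tag is loaded through the same gtag.js loader as the UA tag above; only the ID differs. The following is a minimal sketch, not the site's actual code: `buildGtagSnippet` is a hypothetical helper, and the ID format check is an assumption based on the two IDs mentioned in this issue.

```javascript
// Hypothetical helper: returns the gtag.js snippet for a given tag ID.
// Accepts UA property IDs (UA-...) and GA4 measurement IDs (G-...).
// The format check below is an assumption, not an official specification.
function buildGtagSnippet(tagId) {
  if (!/^(UA-\d+-\d+|G-[A-Z0-9]+)$/.test(tagId)) {
    throw new Error(`Unexpected analytics ID format: ${tagId}`);
  }
  return [
    `<script async src="https://www.googletagmanager.com/gtag/js?id=${tagId}"></script>`,
    '<script>',
    'window.dataLayer = window.dataLayer || [];',
    'function gtag(){dataLayer.push(arguments);}',
    "gtag('js', new Date());",
    `gtag('config', '${tagId}');`,
    '</script>',
  ].join('\n');
}

// Same markup as the excerpt above, with the GA4 measurement ID swapped in.
console.log(buildGtagSnippet('G-JPP6RFM2BP'));
```

In the real site the snippet lives in a Hugo partial, so the equivalent change would be to the ID in that template or in the site config rather than in JavaScript.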

See also:

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 23, 2022
@sftim
Contributor

sftim commented Jul 25, 2022

/kind cleanup
/area web-development
/triage accepted

Hugo version used is 0.87.0

We use Hugo 0.101.0

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. area/web-development Issues or PRs related to the kubernetes.io's infrastructure, design, or build processes triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 25, 2022
@a-mccarthy
Contributor

@chalin Thank you for creating this issue and the related resources on the migration. I'm reading through your docs at the moment, but wanted to let you know that I just checked, and I don't have permissions to add users to our current GA account. I've requested access and I would love to help you with this project if you are looking for help :)

I've worked with @jimangel in the past on getting GA access. Jim, can you help here with getting access? Or helping me get permissions to grant access to folks?

@chalin
Contributor Author

chalin commented Jul 25, 2022

Very cool that you've (@sftim) been upgrading Hugo and Docsy (to v0.2.0 via #32906). On that topic, Docsy is now at v0.4.0. If you consider upgrading, let me know if you run into any issues.

@jimangel
Member

Sorry I missed your ping @a-mccarthy! I was chatting with @chalin, @jeefy, and @onlydole on the CNCF Slack and provided Google Analytics access to a CNCF email account for completing the work. Leaving this comment for record keeping 😄

@chalin
Contributor Author

chalin commented Aug 14, 2022

/assign

@chalin
Contributor Author

chalin commented Aug 14, 2022

Progress update: the GA4 site tag has been created from the main UA property (UA-36037335-10). User access is the same for both. The new GA4 site tag is receiving events and the original UA site tag continues to process events.

GA4 screenshot:

GA4 screenshot

UA screenshot:

UA screenshot

@chalin
Contributor Author

chalin commented Aug 16, 2022

@chalin
Contributor Author

chalin commented Aug 26, 2022

Analytics for release-branch subdomains, July 27 - Aug 25, 2022, FYI:

image

@a-mccarthy
Contributor

@chalin anything you need help with on this issue? Are you blocked by anything?

@chalin
Contributor Author

chalin commented Sep 22, 2022

@chalin anything you need help with on this issue? Are you blocked by anything?

Thanks @a-mccarthy. I'm just back today, and still catching up. Will reach out with an update as soon as I can.

@chalin
Contributor Author

chalin commented Sep 28, 2022

Analytics for release-branch subdomains from Aug 29 - Sept 27:

image

So to answer @sftim's question from #36322 (comment), yes, we'll be updating the GA site tag for release-branches from v1.25 to v1.21.

Does that sound reasonable?

@a-mccarthy
Contributor

Just wanted to note that we have a Google Analytics custom event for our feedback yes/no button that shows at the bottom of each page. We should make sure that this works the same in the next version (not sure if we've tracked this somewhere or not). More details on the tag are in: https://github.com/kubernetes/website/blob/main/layouts/partials/feedback.html

@chalin
Contributor Author

chalin commented Oct 6, 2022

@a-mccarthy: how do you access the feedback currently?

@a-mccarthy
Contributor

@chalin it gets logged as an event, so you can view it in Google Analytics in the Behavior > Events section. I was just looking into this b/c I was curious, and was told of the connection between the feedback on the docs pages and Google Analytics.

At the moment, we don't report out this data anywhere, but we should at some point :)

Here's the code snippet for how it's captured and what we send.

gtag('event', 'click', {
  'event_category': 'Helpful',
  'event_label': window.location.pathname,
  value
});

The value in this case is 1 for yes and 0 for no. And it gets averaged in GA for an average event value.
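
To illustrate the averaging described above, here is a minimal sketch under stated assumptions: `feedbackEventParams` and `averageValue` are hypothetical helpers, the sample votes are made up, and only the parameter names (`event_category`, `event_label`, `value`) come from the snippet in this comment. It does not show how GA itself computes or reports the figure.

```javascript
// Hypothetical helper: builds the parameters sent with the feedback event.
// Parameter names mirror the snippet above; 1 means "yes", 0 means "no".
function feedbackEventParams(pathname, helpful) {
  return {
    event_category: 'Helpful',
    event_label: pathname,
    value: helpful ? 1 : 0,
  };
}

// Average event value over a set of events. With 0/1 values this is the
// share of "yes" votes. Returns null when there are no events.
function averageValue(events) {
  if (events.length === 0) return null;
  const total = events.reduce((sum, e) => sum + e.value, 0);
  return total / events.length;
}

// Made-up sample: three "yes" votes and one "no" vote on the same page.
const votes = [true, true, false, true].map((v) =>
  feedbackEventParams('/docs/home/', v)
);
console.log(averageValue(votes)); // 0.75
```

In the browser, the same parameters would be passed as the third argument of the `gtag('event', 'click', ...)` call shown above, with `window.location.pathname` as the pathname.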

@a-mccarthy
Contributor

Following up about the feedback form again: it appears that we are seeing some bot data in the analytics (more details in #37201). It appears that known bot traffic is filtered out by default in GA4, https://support.google.com/analytics/answer/9888366?hl=en. And we can filter it out in the current GA as well. Wanted to post an update here for awareness :)

@chalin
Contributor Author

chalin commented Nov 2, 2022

... our feedback yes/no button that shows at the bottom of each page. we should make sure that this works the same in the next version

@a-mccarthy: it's not clear that it will work as is. Since processing page-feedback data is probably less urgent than ensuring that GA4 events are being tracked ASAP, I've created a separate issue (and as I mention there, suggest that we tackle it as a followup item to this issue):

@chalin
Contributor Author

chalin commented Nov 3, 2022

Update:

  • All release-branch deploys (as of the time of writing) are using the GA4 tag
  • I've removed the GA4 tag from the UA config. See the opening comment for details concerning this step. Note that the UA tag is still connected from the GA4 tag.

@chalin
Contributor Author

chalin commented Nov 8, 2022

@sftim @nate-double-u @a-mccarthy et al. : as you might recall, #36010 added a fake GA ID to the Hugo config file with a comment letting builders know that the real GA4 ID is set via a Netlify-build environment variable.

A consequence of this approach is that, for example, when the next release-branch is created, someone (the release-lead?) will need to set the GA4 ID environment variable for the new release-branch by copying it from the main site deploy settings. Or, ...

It might be possible to avoid such copying of env. vars because Netlify has a new way of managing environment variables that has a concept of scope and deploy contexts.

image

Here's the link to the docs from that image: https://docs.netlify.com/environment-variables/overview/ (search, e.g., for scope). I don't have access to the website's Netlify account so I can't explore those settings for Kubernetes. Maybe @onlydole, @a-mccarthy, @jimangel or @sftim can, and let us know what they think is possible?
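
The override behavior discussed in this comment can be sketched as follows. This is a minimal illustration, not the site's build logic: the variable name `GA4_MEASUREMENT_ID`, the placeholder value, and the `resolveAnalyticsId` helper are all hypothetical. The idea is that the build uses the environment variable when it is set and falls back to the placeholder from the config file otherwise, so a variable defined once at a shared scope would not need to be copied per release branch.

```javascript
// Placeholder committed to the config file (hypothetical value).
const PLACEHOLDER_ID = 'G-XXXXXXXXXX';

// Hypothetical variable name; the real one is whatever the Netlify site uses.
const ENV_VAR = 'GA4_MEASUREMENT_ID';

// Returns the ID from the environment when present and non-empty,
// otherwise the ID from the config file.
function resolveAnalyticsId(env, configId = PLACEHOLDER_ID) {
  const fromEnv = (env[ENV_VAR] || '').trim();
  return fromEnv !== '' ? fromEnv : configId;
}

// A build with the variable set picks up the real ID.
console.log(resolveAnalyticsId({ GA4_MEASUREMENT_ID: 'G-JPP6RFM2BP' }));
// A build without it (for example, a new release branch) gets the placeholder.
console.log(resolveAnalyticsId({}));
```

In a Node.js build script, the first argument would typically be `process.env`.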

Of course, we can always fall back to hardcoding the GA ID in the config file and just live with:

@chalin
Contributor Author

chalin commented Nov 10, 2022

@a-mccarthy @sftim et al.: There are API reference pages like: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

I have a few questions about these API reference pages:

  • Are we tracking analytics for these pages? They seem to be tracked under UA, but when I look at the page HTML, I don't see any UA ID. Also, UA console says that these pages are actually being hosted by https://jamesdefabia.github.io, but that host reports a 404 for the given paths.
  • Do we want these pages to be tracked via GA4?
  • Where are these pages generated from?

@a-mccarthy
Contributor

a-mccarthy commented Dec 15, 2022

@chalin and I talked through some open items on this project, specifically around Netlify access and selecting a way to organize properties going forward. I have a summary below of what we talked about, and @chalin plans to bring this to the next SIG Docs meeting as well. (@chalin please chime in if I've mis-summarized or missed anything)

A. Currently @chalin has been manually updating release branches with the new Google Analytics 4 ID. This is not ideal, b/c the process is so manual. This should be something we can configure in Netlify, but someone with access will need to go in and do the testing and configuring. See #37877 for more details. @chalin has offered to help with this, but needs access. In general, we've had a few requests from folks in the community to access or update things in Netlify in the past few months, and we have a GitHub Discussion around getting the right folks access as well: #38019. Without proper access to Netlify, forward progress on this task is blocked.

B. Based on @chalin's research, the best and (Google-)recommended way to set up the new Google Analytics assets is to create one account and have all websites tied to the account via a single ID. This means that all web entities (kubernetes.io, kubernetes.dev, minikube, etc.) that we want to track analytics for will use the same ID and be collected into the same "pool" of data in Google Analytics. Then, within GA we can create custom reports that display data for the different sites. The new Google Analytics has greatly reduced the number of default reports available to users, so even if we created separate organizations and IDs for each site, folks would likely still need to create custom reports to view data in any meaningful way.
Keeping everything in one organization and ID also means we'll have one account to administer, instead of multiple, and we can help folks create reports more easily. See @chalin's issue comment for more details on this organizational structure and links to Google documentation on this: #37801 (comment).
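
As a rough illustration of the single-ID setup described above, events from every site land in one stream and are separated again at reporting time, for example by hostname. The sketch below is not GA's reporting API: `countByHost` is a hypothetical helper and the sample events are made up.

```javascript
// Hypothetical helper: counts events per hostname from a combined stream,
// the way a per-site custom report would slice a single pool of data.
function countByHost(events) {
  const counts = new Map();
  for (const { pageUrl } of events) {
    const host = new URL(pageUrl).hostname;
    counts.set(host, (counts.get(host) || 0) + 1);
  }
  return counts;
}

// Made-up sample events, all collected under the same measurement ID.
const sample = [
  { pageUrl: 'https://kubernetes.io/docs/home/' },
  { pageUrl: 'https://kubernetes.io/blog/' },
  { pageUrl: 'https://kubernetes.dev/docs/' },
];
// Counts: kubernetes.io 2, kubernetes.dev 1
console.log(Object.fromEntries(countByHost(sample)));
```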

@chalin
Contributor Author

chalin commented Mar 1, 2023

I'm going to close this issue since the main server and secondary servers have been migrated as of 2023/03/01. 🎉

There are followup items being tracked via separate issues (see the opening comment).

Branch-release leads will need to ensure that GA4 is enabled in future releases. I leave further work in your capable hands. I'll be available if y'all need assistance. For the record, here were my attempts at this:

Finally, note that as of 2024Q1 UA data will be deleted. #sig-docs-maintainers should be aware of this so that they can plan, if necessary, a strategy for backing up the data.
