-rw-r--r-- | net/docs/bug-triage-labels.md | 134
-rw-r--r-- | net/docs/bug-triage-labels.txt | 87
-rw-r--r-- | net/docs/bug-triage-suggested-workflow.md | 230
-rw-r--r-- | net/docs/bug-triage-suggested-workflow.txt | 184
-rw-r--r-- | net/docs/bug-triage.md | 98
-rw-r--r-- | net/docs/bug-triage.txt | 79
6 files changed, 462 insertions, 350 deletions
diff --git a/net/docs/bug-triage-labels.md b/net/docs/bug-triage-labels.md new file mode 100644 index 0000000..d6f3e2b --- /dev/null +++ b/net/docs/bug-triage-labels.md @@ -0,0 +1,134 @@ +# Chrome Network Bug Triage : Labels + +## Some network label caveats + +**Cr-UI-Browser-Downloads** +: Despite the name, this covers all issues related to downloading a file except + saving entire pages (which is **Cr-Blink-SavePage**), not just UI issues. + Most downloads bugs will have the word "download" or "save as" in the + description. Issues with the HTTP server for the Chrome binaries are not + downloads bugs. + +**Cr-UI-Browser-SafeBrowsing** +: Bugs that have to do with the process by which a URL or file is determined to + be dangerous based on our databases, or the resulting interstitials. + Determination of danger based purely on content-type or file extension + belongs in **Cr-UI-Browser-Downloads**, not SafeBrowsing. + +**Cr-Internals-Network-SSL** +: This includes issues that should be also tagged as **Cr-Security-UX** + (certificate error pages or other security interstitials, omnibox indicators + that a page is secure), and more general SSL issues. If you see requests + that die in the SSL negotiation phase, in particular, this is often the + correct label. + +**Cr-Internals-Network-DataProxy** +: Flywheel / the Data Reduction Proxy. Issues require "Reduce Data Usage" be + turned on. Proxy url is [https://proxy.googlezip.net:443](), with + [http://compress.googlezip.net:80]() as a fallback. Currently Android and + iOS only. + +**Cr-Internals-Network-Cache** +: The cache is the layer that handles most range request logic (Though range + requests may also be issued by the PDF plugin, XHRs, or other components). + +**Cr-Internals-Network-SPDY** +: Covers HTTP2 as well. + +**Cr-Internals-Network-HTTP** +: Typically not used. Unclear what it covers, and there's no specific HTTP + owner. 
+ +**Cr-Internals-Network-Logging** +: Covers **about:net-internals**, **about:net-export** as well as the what's + sent to the NetLog. + +**Cr-Internals-Network-Connectivity** +: Issues related to switching between networks, ERR_NETWORK_CHANGED, Chrome + thinking it's online when it's not / navigator.onLine inaccuracies, etc. + +**Cr-Internals-Network-Filters** +: Covers SDCH and gzip issues. ERR_CONTENT_DECODING_FAILED indicates a problem + at this layer, and bugs here can also cause response body corruption. + +## Common non-network labels + +Bugs in these areas often receive the **Cr-Internals-Network** label, though +they fall largely outside the purview of the network stack team: + +**Cr-Blink-Forms** +: Issues submitting forms, forms having weird data, forms sending the wrong + method, etc. + +**Cr-Blink-Loader** +: Cross origin issues are sometimes loader related. Blink also has an + in-memory cache, and when it's used, requests don't appear in + about:net-internals. Requests for the same URL are also often merged there + as well. This does *not* cover issues with content/browser/loader/ files. + +**Cr-Blink-ServiceWorker** + +**Cr-Blink-Storage-AppCache** + +**Cr-Blink-WebSockets** + +**Cr-Blink-XHR** +: Generic issues with sync/async XHR requests - missing request or response + headers, multiple headers, etc. These will often run into issues in certain + corner cases (Cross origin / CORS, proxy, whatever). Attach all labels that + seem appropriate. + +**Cr-Services-Sync** +: Sharing data/tabs/history/passwords/etc between machines not working. + +**Cr-Services-Chromoting** + +**Cr-Platform-Extensions** +: Issues extensions loading / not loading / hanging. + +**Cr-Platform-Extensions-API** +: Issues with network related extension APIs should have this label. + chrome.webRequest is the big one, I believe, but there are others. + +**Cr-Internals-Plugins-Pepper[-SDK]** + +**Cr-UI-Browser-Omnibox** +: Basically any issue with the omnibox. 
URLs being treated as search queries + rather than navigations, dropdown results being weird, not handling certain + unicode characters, etc. If the issue is new TLDs not being recognized by + the omnibox, that's due to Chrome's TLD list being out of date, and not an + omnibox issue. Such TLD issues should be duped against + http://crbug.com/37436. + +**Cr-Internals-Media-Network** +: Issues related to media. These often run into the 6 requests per hostname + issue, and also have fun interactions with the cache, particularly in the + range request case. + +**Cr-Internals-Plugins-PDF** +: Issues loading pdf files. These are often related to range requests, which + also have some logic at the Internals-Network-Cache layer. + +**Cr-UI-Browser-Navigation** + +**Cr-UI-Browser-History** +: Issues which only appear with forward/back navigation. + +**Cr-OS-Systems-Network** / **Cr-OS-Systems-Mobile** / **Cr-OS-Systems-Bluetooth** +: These should be used for issues with ChromeOS's platform network code, and + not net/ issues on ChromeOS. + +**Cr-Blink-SecurityFeature** +: CORS / Cross origin issues. Main frame cross-origin navigation issues are + often actually **Cr-UI-Browser-Navigation** issues. + +**Cr-Privacy** +: Privacy related bug (History, cookies discoverable by an entity that + shouldn't be able to do so, incognito state being saved in memory or on disk + beyond the lifetime of incognito tabs, etc). Generally used in conjunction + with other labels. + +**Type-Bug-Security** +: Security related bug (Allows for code execution from remote site, allows + crossing security boundaries, unchecked array bounds, + etc). 
diff --git a/net/docs/bug-triage-labels.txt b/net/docs/bug-triage-labels.txt deleted file mode 100644 index c5d5505..0000000 --- a/net/docs/bug-triage-labels.txt +++ /dev/null @@ -1,87 +0,0 @@ -Some network label caveats -* Cr-UI-Browser-Downloads: Despite the name, this covers all issues related to - downloading a file except saving entire pages (Which is Cr-Blink-SavePage), - not just UI issues. Most downloads bugs will have the word "download" or - "save as" in the description. Issues with the HTTP server for the Chrome - binaries are not downloads bugs. -* Cr-UI-Browser-SafeBrowsing: Bugs that have to do with the process by which a - URL or file is determined to be dangerous based on our databases, or the - resulting interstitials. Determination of danger based purely on - content-type or file extension belongs in Cr-UI-Browser-Downloads, not - SafeBrowsing. -* Cr-Internals-Network-SSL: This includes issues that should be also tagged as - Cr-Security-UX (certificate error pages or other security interstitials, - omnibox indicators that a page is secure), and more general SSL issues. If - you see requests that die in the SSL negotiation phase, in particular, this - is often the correct label. -* Cr-Internals-Network-DataProxy: Flywheel / the Data Reduction Proxy. Issues - require "Reduce Data Usage" be turned on. Proxy url is - https://proxy.googlezip.net:443, with compress.googlezip.net:80 as a - fallback. Currently Android and iOS only. -* Cr-Internals-Network-Cache: The cache is the layer that handles most range - request logic (Though range requests may also be issued by the PDF plugin, - XHRs, or other components). -* Cr-Internals-Network-SPDY: Covers HTTP2 as well. -* Cr-Internals-Network-HTTP: Typically not used. Unclear what it covers, and - there's no specific HTTP owner. -* Cr-Internals-Network-Logging: Covers about:net-internals, about:net-export as - well as the what's sent to the NetLog. 
-* Cr-Internals-Network-Connectivity: Issues related to switching between - networks, ERR_NETWORK_CHANGED, Chrome thinking it's online when it's not / - navigator.onLine inaccuracies, etc. -* Cr-Internals-Network-Filters: Covers SDCH and gzip issues. - ERR_CONTENT_DECODING_FAILED indicates a problem at this layer, and bugs here - can also cause response body corruption. - - -Common non-network label reference. Bugs in these areas often receive the -Cr-Internals-Network label, though they fall largely outside the purview of the -network stack team: -* Cr-Blink-Forms: Issues submitting forms, forms having weird data, forms - sending the wrong method, etc. -* Cr-Blink-Loader: Cross origin issues are sometimes loader related. Blink - also has an in-memory cache, and when it's used, requests don't appear in - about:net-internals. Requests for the same URL are also often merged there - as well. This does *not* cover issues with content/browser/loader/ files. -* Cr-Blink-ServiceWorker -* Cr-Blink-Storage-AppCache -* Cr-Blink-WebSockets -* Cr-Blink-XHR: Generic issues with sync/async XHR requests - missing request - or response headers, multiple headers, etc. These will often run into - issues in certain corner cases (Cross origin / CORS, proxy, whatever). - Attach all labels that seem appropriate. -* Cr-Services-Sync: Sharing data/tabs/history/passwords/etc between machines - not working. -* Cr-Services-Chromoting -* Cr-Platform-Extensions: Issues extensions loading / not loading / hanging. -* Cr-Platform-Extensions-API: Issues with network related extension APIs should - have this label. chrome.webRequest is the big one, I believe, but there are - others. -* Cr-Internals-Plugins-Pepper[-SDK] -* Cr-UI-Browser-Omnibox: Basically any issue with the omnibox. URLs being - treated as search queries rather than navigations, dropdown results being - weird, not handling certain unicode characters, etc. 
If the issue is new - TLDs not being recognized by the omnibox, that's due to Chrome's TLD list - being out of date, and not an omnibox issue. Such TLD issues should be - duped against http://crbug.com/37436. -* Cr-Internals-Media-Network: Issues related to media. These often run into - the 6 requests per hostname issue, and also have fun interactions with the - cache, particularly in the range request case. -* Cr-Internals-Plugins-PDF: Issues loading pdf files. These are often related - to range requests, which also have some logic at the Internals-Network-Cache - layer. -* Cr-UI-Browser-Navigation -* Cr-UI-Browser-History: Issues which only appear with forward/back navigation. -* Cr-OS-Systems-Network / Cr-OS-Systems-Mobile / Cr-OS-Systems-Bluetooth: These - should be used for issues with ChromeOS's platform network code, and not - net/ issues on ChromeOS. -* Cr-Blink-SecurityFeature: CORS / Cross origin issues. Main frame - cross-origin navigation issues are often actually Cr-UI-Browser-Navigation - issues. -* Cr-Privacy: Privacy related bug (History, cookies discoverable by an entity - that shouldn't be able to do so, incognito state being saved in memory or on - disk beyond the lifetime of incognito tabs, etc). Generally used in - conjunction with other labels. -* Type-Bug-Security: Security related bug (Allows for code execution from - remote site, allows crossing security boundaries, unchecked array bounds, - etc). diff --git a/net/docs/bug-triage-suggested-workflow.md b/net/docs/bug-triage-suggested-workflow.md new file mode 100644 index 0000000..4c448bc --- /dev/null +++ b/net/docs/bug-triage-suggested-workflow.md @@ -0,0 +1,230 @@ +# Chrome Network Bug Triage : Suggested Workflow + +[TOC] + +## Looking for new crashers + +1. Go to [go/chromecrash](https://goto.google.com/chromecrash). + +2. For each platform, look through the releases for which releases to + investigate. 
As per bug-triage.txt, this should be the most recent canary, + the previous canary (if the most recent is less than a day old), and any of + dev/beta/stable that were released in the last couple of days. + +3. For each release, in the "Process Type" frame, click on "browser". + +4. At the bottom of the "Magic Signature" frame, click "limit 1000". Reported + crashers are sorted in decreasing order of the number of reports for that + crash signature. + +5. Search the page for *"net::"*. + +6. For each found signature: + * If there is a bug already filed, make sure it is correctly describing the + current bug (e.g. not closed, or not describing a long-past issue), and + make sure that if it is a *net* bug, that it is labeled as such. + * Ignore signatures that only occur once, as memory corruption can easily + cause one-off failures when the sample size is large enough. + * Ignore signatures that only come from a single client ID, as individual + machine malware and breakage can also easily cause one-off failures. + * Click on the number of reports field to see details of crash. Ignore it + if it doesn't appear to be a network bug. + * Otherwise, file a new bug directly from chromecrash. Note that this may + result in filing bugs for low- and very-low- frequency crashes. That's + ok; the bug tracker is a better tool to figure out whether or not we put + resources into those crashes than a snap judgement when filing bugs. + * For each bug you file, include the following information: + * The backtrace. Note that the backtrace should not be added to the + bug if Restrict-View-Google isn't set on the bug as it may contain + PII. Filing the bug from the crash reporter should do this + automatically, but check. + * The channel in which the bug is seen (canary/dev/beta/stable), its + frequency in that channel, and its rank among crashers in the + channel. + * The frequency of this signature in recent releases. This information + is available by: + 1. 
Clicking on the signature in the "Magic Signature" list + 2. Clicking "Edit" on the dremel query at the top of the page + 3. Removing the "product.version='X.Y.Z.W' AND" string and clicking + "Update". + 4. Clicking "Limit 1000" in the Product Version list in the + resulting page (without this, the listing will be restricted to + the releases in which the signature is most common, which will + often not include the canary/dev release being investigated). + 5. Choose some subset of that list, or all of it, to include in the + bug. Make sure to indicate if there is a defined point in the + past before which the signature is not present. + +## Identifying unlabeled network bugs on the tracker + +* Look at new uncomfirmed bugs since noon PST on the last triager's rotation. + [Use this issue tracker + query](https://code.google.com/p/chromium/issues/list?can=2&q=status%3Aunconfirmed&sort=-id&num=1000). + +* Press **h** to bring up a preview of the bug text. + +* Use **j** and **k** to advance through bugs. + +* If a bug looks like it might be network/download/safe-browsing related, + middle click (or command-click on OSX) to open in new tab. + +* If a user provides a crash ID for a crasher for a bug that could be + net-related, look at the crash stack at + [go/crash](https://goto.google.com/crash), and see if it looks to be network + related. Be sure to check if other bug reports have that stack trace, and + mark as a dupe if so. Even if the bug isn't network related, paste the stack + trace in the bug, so no one else has to look up the crash stack from the ID. + * If there's no other information than the crash ID, ask for more details + and add the Needs-Feedback label. + +* If network causes are possible, ask for a net-internals log (If it's not a + browser crash) and attach the most specific internals-network label that's + applicable. 
If there isn't an applicable narrower label, a clear owner for + the issue, or there are multiple possibilities, attach the internals-network + label and proceed with further investigation. + +* If non-network causes also seem possible, attach those labels as well. + +## Investigating Cr-Internals-Network bugs + +* It's recommended that while on triage duty, you subscribe to the + Cr-Internals-Network label. To do this, go to + https://code.google.com/p/chromium/issues/ and click on "Subscriptions". + Enter "Cr-Internals-Network" and click submit. + +* Look through uncomfirmed and untriaged Cr-Internals-Network bugs, + prioritizing those updated within the last week. [Use this issue tracker + query](https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3DInternals-Network+-status%3AAssigned+-status%3AStarted+-status%3AAvailable+&sort=-modified). + +* If more information is needed from the reporter, ask for it and add the + Needs-Feedback label. If the reporter has answered an earlier request for + information, remove that label. + +* While investigating a new issue, change the status to Untriaged. + +* If a bug is a potential security issue (Allows for code execution from remote + site, allows crossing security boundaries, unchecked array bounds, etc) mark + it Type-Bug-Security. If it has privacy implication (History, cookies + discoverable by an entity that shouldn't be able to do so, incognito state + being saved in memory or on disk beyond the lifetime of incognito tabs, etc), + mark it Cr-Privacy. + +* For bugs that already have a more specific network label, go ahead and remove + the Cr-Internals-Network label and move on. + +* Try to figure out if it's really a network bug. See common non-network + labels section for description of common labels needed for issues incorrectly + tagged as Cr-Internals-Network. + +* If it's not, attach appropriate labels and go no further. 
+ +* If it may be a network bug, attach additional possibly relevant labels if + any, and continue investigating. Once you either determine it's a + non-network bug, or figure out accurate more specific network labels, your + job is done, though you should still ask for a net-internals dump if it seems + likely to be useful. + +* Note that ChromeOS-specific network-related code (Captive portal detection, + connectivity detection, login, etc) may not all have appropriate more + specific labels, but are not in areas handled by the network stack team. + Just make sure those have the OS-Chrome label, and any more specific labels + if applicable, and then move on. + +* Gather data and investigate. + * Remember to add the Needs-Feedback label whenever waiting for the user to + respond with more information, and remove it when not waiting on the + user. + * Try to reproduce locally. If you can, and it's a regression, use + src/tools/bisect-builds.py to figure out when it regressed. + * Ask more data from the user as needed (net-internals dumps, repro case, + crash ID from about:crashes, run tests, etc). + * If asking for an about:net-internals dump, provide this link: + https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details. + Can just grab the link from about:net-internals, as needed. + +* Try to figure out what's going on, and which more specific network label is + most appropriate. + +* If it's a regression, browse through the git history of relevant files to try + and figure out when it regressed. CC authors / primary reviewers of any + strongly suspect CLs. + +* If you are having trouble with an issue, particularly for help understanding + net-internals logs, email the public net-dev@chromium.org list for help + debugging. If it's a crasher, or for some other reason discussion needs to + be done in private, use chrome-network-debugging@google.com. TODO(mmenke): + Write up a net-internals tips and tricks docs. 
+ +* If it appears to be a bug in the unowned core of the network stack (i.e. no + sublabel applies, or only the Cr-Internals-Network-HTTP sublabel applies, and + there's no clear owner), try to figure out the exact cause. + +## Monitoring UMA histograms and gasper alerts + +For each Gasper alert that fires, determine if it's a real alert and file a bug +if so. + +* Don't file if the alert is coincident with a major volume change. The volume + at a particular date can be determined by hovering the mouse over the + appropriate location on the alert line. + +* Don't file if the alert is on a graph with very low volume (< ~200 data + points); it's probably noise, and we probably don't care even if it isn't. + +* Don't file if the graph is really noisy (but eyeball it to decide if there is + an underlying important shift under the noise). + +* Don't file if the alert is in the "Known Ignorable" list: + * SimpleCache on Windows + * DiskCache on Android. + +For each Gasper alert, respond to chrome-network-debugging@google.com with a +summary of the action you've taken and why, including issue link if an issue +was filed. + +## Investigating crashers + +* Only investigate crashers that are still occurring, as identified by above + section. If a search on go/crash indicates a crasher is no longer occurring, + mark it as WontFix. + +* Particularly for Windows, look for weird dlls associated with the crashes. + If there are some, it may be caused by malware. You can often figure out if + a dll is malware by a search, though it's harder to figure out if a dll is + definitively not malware. + +* See if the same users are repeatedly running into the same issue. This can + be accomplished by search for (Or clicking on) the client ID associated with + a crash report, and seeing if there are multiple reports for the same crash. + If this is the case, it may be also be malware, or an issue with an unusual + system/chrome/network config. 
+ +* Dig through crash reports to figure out when the crash first appeared, and + dig through revision history in related files to try and locate a suspect CL. + TODO(mmenke): Add more detail here. + +* Load crash dumps, try to figure out a cause. See + http://www.chromium.org/developers/crash-reports for more information + +## Dealing with old bugs + +* For all network issues (Even those with owners, or a more specific labels): + + * If the issue has had the Needs-Feedback label for over a month, verify it + is waiting on feedback from the user. If not, remove the label. + Otherwise, go ahead and mark the issue WontFix due to lack of response + and suggest the user file a new bug if the issue is still present. [Use + this issue tracker query for old Needs-Feedback + issues](https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3AInternals-Network%20Needs=Feedback+modified-before%3Atoday-30&sort=-modified). + + * If a bug is over 2 months old, and the underlying problem was never + reproduced or really understood: + * If it's over a year old, go ahead and mark the issue as Archived. + * Otherwise, ask reporters if the issue is still present, and attach + the Needs-Feedback label. + +* Old unconfirmed or untriaged Cr-Internals-Network issues can be investigated + just like newer ones. Crashers should generally be given higher priority, + since we can verify if they still occur, and then newer issues, as they're + more likely to still be present, and more likely to have a still responsive + bug reporter. diff --git a/net/docs/bug-triage-suggested-workflow.txt b/net/docs/bug-triage-suggested-workflow.txt deleted file mode 100644 index a01849d..0000000 --- a/net/docs/bug-triage-suggested-workflow.txt +++ /dev/null @@ -1,184 +0,0 @@ -Look for new crashers: -* Go to go/chromecrash. -* For each platform, look through the releases for which releases to - investigate. 
As per bug-triage.txt, this should be the - most recent canary, the previous canary (if the most recent is less - than a day old), and any of dev/beta/stable that were released in the - last couple of days. -* For each release, in the "Process Type" frame, click on "browser". -* At the bottom of the "Magic Signature" frame, click "limit 1000". - Reported crashers are sorted in decreasing order of the number of reports for - that crash signature. -* Search the page for "net::". -* For each found signature: - * If there is a bug already filed, make sure it is correctly - describing the current bug (e.g. not closed, or not describing a - long-past issue), and make sure that if it is a net:: bug, that - it is labeled as such. - * Ignore signatures that only occur once, as memory corruption can - easily cause one-off failures when the sample size is large - enough. - * Ignore signatures that only come from a single client ID, as - individual machine malware and breakage can also easily cause - one-off failures. - * Click on the number of reports field to see details of - crash. Ignore it if it doesn't appear to be a network bug. - * Otherwise, file a new bug directly from chromecrash. Note that - this may result in filing bugs for low- and very-low- frequency - crashes. That's ok; the bug tracker is a better tool to figure - out whether or not we put resources into those crashes than a snap - judgement when filing bugs. -* For each bug you file, include the following information: - * The backtrace. Note that the backtrace should not be added to the - bug if Restrict-View-Google isn't set on the bug as it may contain - PII. Filing the bug from the crash reporter should do this - automatically, but check. - * The channel in which the bug is seen (canary/dev/beta/stable), - its frequency in that channel, and its rank among crashers in the channel. - * The frequency of this signature in recent releases. 
This - information is available by: - * Clicking on the signature in the "Magic Signature" list - * Clicking "Edit" on the dremel query at the top of the page - * Removing the "product.version='X.Y.Z.W' AND" string and clicking - "Update". - * Clicking "Limit 1000" in the Product Version list in the - resulting page (without this, the listing will be restricted to - the releases in which the signature is most common, which will - often not include the canary/dev release being investigated). - * Choose some subset of that list, or all of it, to include in the - bug. Make sure to indicate if there is a defined point in the - past before which the signature is not present. - -Identifying unlabeled network bugs on the tracker: -* Look at new uncomfirmed bugs since noon PST on the last triager's rotation: - https://code.google.com/p/chromium/issues/list?can=2&q=status%3Aunconfirmed&sort=-id&num=1000 -* Press "h" to bring up a preview of the bug text. -* Use "j" and "k" to advance through bugs. -* If a bug looks like it might be network/download/safe-browsing related, middle - click [or command-click on OSX] to open in new tab. -* If a user provides a crash ID for a crasher for a bug that could be - net-related, look at the crash stack at go/crash, and see if it looks to be - network related. Be sure to check if other bug reports have that stack - trace, and mark as a dupe if so. Even if the bug isn't network related, - paste the stack trace in the bug, so no one else has to look up the crash - stack from the ID. - * If there's no other information than the crash ID, ask for more details and - add the Needs-Feedback label. -* If network causes are possible, ask for a net-internals log (If it's not a - browser crash) and attach the most specific internals-network label that's - applicable. 
If there isn't an applicable narrower label, a clear owner for - the issue, or there are multiple possibilities, attach the internals-network - label and proceed with further investigation. -* If non-network causes also seem possible, attach those labels as well. - -Investigating Cr-Internals-Network bugs: -* It's recommended that while on triage duty, you subscribe to the - Cr-Internals-Network label. To do this, go to - https://code.google.com/p/chromium/issues/ and click on "Subscriptions". - Enter Cr-Internals-Network and click submit. -* Look through uncomfirmed and untriaged Cr-Internals-Network bugs, prioritizing - those updated within the last week: - https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3DInternals-Network+-status%3AAssigned+-status%3AStarted+-status%3AAvailable+&sort=-modified -* If more information is needed from the reporter, ask for it and - add the 'Needs-Feedback' label. If the reporter has answered an - earlier request for information, remove that label. -* While investigating a new issue, change the status to Untriaged. -* If a bug is a potential security issue (Allows for code execution from remote - site, allows crossing security boundaries, unchecked array bounds, etc) mark - it Type-Bug-Security. If it has privacy implication (History, cookies - discoverable by an entity that shouldn't be able to do so, incognito state - being saved in memory or on disk beyond the lifetime of incognito tabs, - etc), mark it Cr-Privacy. -* For bugs that already have a more specific network label, go ahead and remove - the Cr-Internals-Network label and move on. -* Try to figure out if it's really a network bug. See common non-network labels - section for description of common labels needed for issues incorrectly - tagged as Cr-Internals-Network. -* If it's not, attach appropriate labels and go no further. -* If it may be a network bug, attach additional possibly relevant labels if any, - and continue investigating. 
Once you either determine it's a non-network - bug, or figure out accurate more specific network labels, your job is done, - though you should still ask for a net-internals dump if it seems likely to - be useful. -* Note that ChromeOS-specific network-related code (Captive portal detection, - connectivity detection, login, etc) may not all have appropriate more - specific labels, but are not in areas handled by the network stack team. - Just make sure those have the OS-Chrome label, and any more specific labels - if applicable, and then move on. -* Gather data and investigate. - * Remember to add the Needs-Feedback label whenever waiting for the user to - respond with more information, and remove it when not waiting on the user. - * Try to reproduce locally. If you can, and it's a regression, use - src/tools/bisect-builds.py to figure out when it regressed. - * Ask more data from the user as needed (net-internals dumps, repro case, - crash ID from about:crashes, run tests, etc). - * If asking for an about:net-internals dump, provide this link: - https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details. - Can just grab the link from about:net-internals, as needed. -* Try to figure out what's going on, and which more specific network label is - most appropriate. -* If it's a regression, browse through the git history of relevant files to try - and figure out when it regressed. CC authors / primary reviewers of any - strongly suspect CLs. -* If you are having trouble with an issue, particularly for help understanding - net-internals logs, email the public net-dev@chromium.org list for help - debugging. If it's a crasher, or for some other reason discussion needs to - be done in private, use chrome-network-debugging@google.com. - TODO(mmenke): Write up a net-internals tips and tricks docs. -* If it appears to be a bug in the unowned core of the network stack (i.e. 
no - sublabel applies, or only the Cr-Internals-Network-HTTP sublabel applies, - and there's no clear owner), try to figure out the exact cause. - -Monitor UMA histograms and gasper alerts. For each Gasper alert that -fires, determine if it's a real alert and file a bug if so. -* Don't file if the alert is coincident with a major volume change. - The volume at a particular date can be determined by hovering the - mouse over the appropriate location on the alert line. -* Don't file if the alert is on a graph with very low volume (< ~200 - data points); it's probably noise, and we probably don't care even - if it isn't. -* Don't file if the graph is really noisy (but eyeball it to decide if - there is an underlying important shift under the noise). -* Don't file if the alert is in the "Known Ignorable" list: - * SimpleCache on Windows - * DiskCache on Android. -For each Gasper alert, respond to chrome-network-debugging@ with a -summary of the action you've taken and why, including issue link if an -issue was filed. - -Investigating crashers: -* Only investigate crashers that are still occurring, as identified by above - section. If a search on go/crash indicates a crasher is no longer - occurring, mark it as WontFix. -* Particularly for Windows, look for weird dlls associated with the crashes. - If there are some, it may be caused by malware. You can often figure out if - a dll is malware by a search, though it's harder to figure out if a dll is - definitively not malware. -* See if the same users are repeatedly running into the same issue. This can be - accomplished by search for (Or clicking on) the client ID associated with a - crash report, and seeing if there are multiple reports for the same crash. - If this is the case, it may be also be malware, or an issue with an unusual - system/chrome/network config. -* Dig through crash reports to figure out when the crash first appeared, and dig - through revision history in related files to try and locate a suspect CL. 
-  TODO(mmenke): Add more detail here.
-* Load crash dumps, try to figure out a cause.
-  See http://www.chromium.org/developers/crash-reports for more information
-
-Dealing with old bugs:
-* For all network issues (Even those with owners, or a more specific labels):
-  * If the issue has had the Needs-Feedback label for over a month, verify it
-    is waiting on feedback from the user. If not, remove the label.
-    Otherwise, go ahead and mark the issue WontFix due to lack of response and
-    suggest the user file a new bug if the issue is still present.
-    Old Needs-Feedback issues: https://code.google.com/p/chromium/issues/list?can=2&q=Cr%3AInternals-Network%20Needs=Feedback+modified-before%3Atoday-30&sort=-modified
-  * If a bug is over 2 months old, and the underlying problem was never
-    reproduced or really understood:
-    * If it's over a year old, go ahead and mark the issue as Archived.
-    * Otherwise, ask reporters if the issue is still present, and attach the
-      Needs-Feedback label.
-* Old unconfirmed or untriaged Cr-Internals-Network issues can be investigated
-  just like newer ones. Crashers should generally be given higher priority,
-  since we can verify if they still occur, and then newer issues, as they're
-  more likely to still be present, and more likely to have a still responsive
-  bug reporter.
diff --git a/net/docs/bug-triage.md b/net/docs/bug-triage.md
new file mode 100644
index 0000000..58d4c749
--- /dev/null
+++ b/net/docs/bug-triage.md
@@ -0,0 +1,98 @@
+# Chrome Network Bug Triage
+
+The Chrome network team uses a two day bug triage rotation. The main goals are
+to identify and label new network bugs, and investigate network bugs when no
+label seems suitable.
+
+## Responsibilities
+
+### Required:
+* Identify new crashers
+* Identify new network issues.
+* Request data about recent Cr-Internals-Network issue.
+* Investigate each recent Cr-Internals-Network issue.
+* Monitor UMA histograms and gasper alerts.
+
+### Best effort:
+* Investigate unowned and owned-but-forgotten net/ crashers
+* Investigate old bugs
+* Close obsolete bugs.
+
+All of the above is to be done on each rotation. These responsibilities should
+be tracked, and anything left undone at the end of a rotation should be handed
+off to the next triager. The downside to passing along bug investigations like
+this is each new triager has to get back up to speed on bugs the previous
+triager was investigating. The upside is that triagers don't get stuck
+investigating issues after their time after their rotation, and it results in a
+uniform, predictable two day commitment for all triagers.
+
+## Details
+
+### Required:
+
+* Identify new crashers that are potentially network related. You should check
+  the most recent canary, the previous canary (if the most recent less than a
+  day old), and any of dev/beta/stable that were released in the last couple of
+  days, for each platform. File Cr-Internals-Network bugs on the tracker when
+  new crashers are found.
+
+* Identify new network bugs, both on the bug tracker and on the crash server.
+  All Unconfirmed issues filed during your triage rotation should be scanned,
+  and, for suspected network bugs, a network label assigned. A triager is
+  responsible for looking at bugs reported from noon PST / 3:00 pm EST of the
+  last day of the previous triager's rotation until the same time on the last
+  day of their rotation.
+
+* Investigate each recent (new comment within the past week or so)
+  Cr-Internals-Network issue, driving getting information from reporters as
+  needed, until you can do one of the following:
+
+    * Mark it as *WontFix* (working as intended, obsolete issue) or a
+      duplicate.
+
+    * Mark it as a feature request.
+
+    * Remove the Cr-Internals-Network label, replacing it with at least one
+      more specific network label or non-network label.
Promptly adding
+      non-network labels when appropriate is important to get new bugs in front
+      of someone familiar with the relevant code, and to remove them from the
+      next triager's radar. Because of the way the bug report wizard works, a
+      lot of bugs incorrectly end up with the network label.
+
+    * The issue is assigned to an appropriate owner.
+
+    * If there is no more specific label for a bug, it should be investigated
+      until we have a good understanding of the cause of the problem, and some
+      idea how it should be fixed, at which point its status should be set to
+      Available. Future triagers should ignore bugs with this status, unless
+      investigating stale bugs.
+
+* Monitor UMA histograms and gasper alerts.
+
+    * For each Gasper alert that fires, the triager should determine if the
+      alert is real (not due to noise), and file a bug with the appropriate
+      label if so. Note that if no label more specific than
+      Cr-Internals-Network is appropriate, the responsibility remains with the
+      triager to continue investigating the bug, as above.
+
+### Best Effort (As you have time):
+
+* Investigate unowned and owned but forgotten net/ crashers that are still
+  occurring (As indicated by
+  [go/chromecrash](https://goto.google.com/chromecrash)), prioritizing frequent
+  and long standing crashers.
+
+* Investigate old bugs, prioritizing the most recent.
+
+* Close obsolete bugs.
+
+If you've investigated an issue (in code you don't normally work on) to an
+extent that you know how to fix it, and the fix is simple, feel free to take
+ownership of the issue and create a patch while on triage duty, but other tasks
+should take priority.
+
+See [bug-triage-suggested-workflow.md](bug-triage-suggested-workflow.md) for
+suggested workflows.
+
+See [bug-triage-labels.md](bug-triage-labels.md) for labeling tips for network
+and non-network bugs.
diff --git a/net/docs/bug-triage.txt b/net/docs/bug-triage.txt
deleted file mode 100644
index c2341f1..0000000
--- a/net/docs/bug-triage.txt
+++ /dev/null
@@ -1,79 +0,0 @@
-The Chrome network team uses a two day bug triage rotation. The main goals are
-to identify and label new network bugs, and investigate network bugs when no
-label seems suitable.
-
-Responsibilities
-
-Required:
-* Identify new crashers
-* Identify new network issues.
-* Request data about recent Cr-Internals-Network issue.
-* Investigate each recent Cr-Internals-Network issue.
-* Monitor UMA histograms and gasper alerts.
-
-Best effort:
-* Investigate unowned and owned-but-forgotten net/ crashers
-* Investigate old bugs
-* Close obsolete bugs.
-
-All of the above is to be done on each rotation. These
-responsibilities should be tracked, and anything left undone at the
-end of a rotation should be handed off to the next triager. The
-downside to passing along bug investigations like this is each new
-triager has to get back up to speed on bugs the previous triager was
-investigating. The upside is that triagers don't get stuck
-investigating issues after their time after their rotation, and it
-results in a uniform, predictable two day commitment for all triagers.
-
-More detail:
-
-Required activities:
-* Identify new crashers that are potentially network related. You should check
-  the most recent canary, the previous canary (if the most recent less than a
-  day old), and any of dev/beta/stable that were released in the last couple
-  of days, for each platform. File Cr-Internals-Network bugs on the tracker
-  when new crashers are found.
-* Identify new network bugs, both on the bug tracker and on the crash server.
-  All Unconfirmed issues filed during your triage rotation should be scanned,
-  and, for suspected network bugs, a network label assigned.
A triager is
-  responsible for looking at bugs reported from noon PST / 3:00 pm EST of the
-  last day of the previous triager's rotation until the same time on the last
-  day of their rotation.
-* Investigate each recent (New comment within the past week or so)
-  Cr-Internals-Network issue, driving getting information from reporters as
-  needed, until you can do one of the following:
-  * Mark it as WontFix (working as intended, obsolete issue) or a duplicate.
-  * Mark it as a feature request.
-  * Remove the Cr-Internals-Network label, replacing it with at least one more
-    specific network label or non-network label. Promptly adding non-network
-    labels when appropriate is important to get new bugs in front of someone
-    familiar with the relevant code, and to remove them from the next triager's
-    radar. Because of the way the bug report wizard works, a lot of bugs
-    incorrectly end up with the network label.
-  * The issue is assigned to an appropriate owner.
-  * If there is no more specific label for a bug, it should be investigated
-    until we have a good understanding of the cause of the problem, and some
-    idea how it should be fixed, at which point its status should be set to
-    Available. Future triagers should ignore bugs with this status, unless
-    investigating stale bugs.
-* Monitor UMA histograms and gasper alerts.
-  * For each Gasper alert that fires, the triager should determine if
-    the alert is real (not due to noise), and file a bug with the
-    appropriate label if so. Note that if no label more specific than
-    Cr-Internals-Network is appropriate, the responsibility remains
-    with the triager to continue investigating the bug, as above.
-
-Best Effort (As you have time):
-* Investigate unowned and owned but forgotten net/ crashers that are still
-  occurring (As indicated by go/chromecrash), prioritizing frequent and long
-  standing crashers.
-* Investigate old bugs, prioritizing the most recent.
-* Close obsolete bugs.
-
-If you've investigated an issue (in code you don't normally work on) to an
-extent that you know how to fix it, and the fix is simple, feel free to take
-ownership of the issue and create a patch while on triage duty, but other tasks
-should take priority.
-
-See bug-triage-suggested-workflow.txt for suggested workflows.
-See bug-triage-labels.txt for labeling tips for network and non-network bugs.