diff options
author | jar@chromium.org <jar@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98> | 2009-01-20 20:32:42 +0000 |
---|---|---|
committer | jar@chromium.org <jar@chromium.org@0039d316-1c4b-4281-b951-d872f2087c98> | 2009-01-20 20:32:42 +0000 |
commit | 281d288831f8c41b09669140963059d9f81c5c52 (patch) | |
tree | 8903da6a28ba482cd2b9a83ac7deb354193aaef6 /chrome | |
parent | 0f9f98ab11fe6056636a34ae57602edeef60589f (diff) | |
download | chromium_src-281d288831f8c41b09669140963059d9f81c5c52.zip chromium_src-281d288831f8c41b09669140963059d9f81c5c52.tar.gz chromium_src-281d288831f8c41b09669140963059d9f81c5c52.tar.bz2 |
Add more production logging to diagnose upload problem
It appears that some logs get "stuck" trying to be sent,
so I've changed a number of DLOG() calls to LOG() calls.
I also added a bit of info to the LOG() calls, and made
minor edits.
bug=1505736
r=evanm
Review URL: http://codereview.chromium.org/18354
git-svn-id: svn://svn.chromium.org/chrome/trunk/src@8304 0039d316-1c4b-4281-b951-d872f2087c98
Diffstat (limited to 'chrome')
-rw-r--r-- | chrome/browser/metrics/metrics_service.cc | 56 |
1 files changed, 29 insertions, 27 deletions
diff --git a/chrome/browser/metrics/metrics_service.cc b/chrome/browser/metrics/metrics_service.cc index e7fefe5..e830659 100644 --- a/chrome/browser/metrics/metrics_service.cc +++ b/chrome/browser/metrics/metrics_service.cc @@ -17,21 +17,21 @@ // memory statistics, snapshotting the current stats of numerous histograms, // closing the logs, translating to XML text, and compressing the results for // transmission. Transmission includes submitting a compressed log as data in a -// URL-get, and retransmitting (or retaining at process termination) if the +// URL-post, and retransmitting (or retaining at process termination) if the // attempted transmission failed. Retention across process terminations is done // using the the PrefServices facilities. The format for the retained // logs (ones that never got transmitted) is always the uncompressed textual // representation. // -// Logs fall into one of two categories: "Initial logs," and "ongoing logs." -// There is at most one initial log sent for each complete run of the chrome +// Logs fall into one of two categories: "initial logs," and "ongoing logs." +// There is at most one initial log sent for each complete run of the chromium // product (from startup, to browser shutdown). An initial log is generally // transmitted some short time (1 minute?) after startup, and includes stats // such as recent crash info, the number and types of plugins, etc. The -// external server's response to the initial log conceptually tells -// this MS if it should continue transmitting logs (during this session). The -// server response can actually be much more detailed, and always includes (at -// a minimum) how often additional ongoing logs should be sent. +// external server's response to the initial log conceptually tells this MS if +// it should continue transmitting logs (during this session). The server +// response can actually be much more detailed, and always includes (at a +// minimum) how often additional ongoing logs should be sent. // // After the above initial log, a series of ongoing logs will be transmitted. // The first ongoing log actually begins to accumulate information stating when @@ -40,14 +40,15 @@ // much in the way of user stats. The most common interlog period (delay) // is 20 minutes. That time period starts when the first user action causes a // logging event. This means that if there is no user action, there may be long -// periods without any (ongoing) log transmissions. Ongoing log typically +// periods without any (ongoing) log transmissions. Ongoing logs typically // contain very detailed records of user activities (ex: opened tab, closed // tab, fetched URL, maximized window, etc.) In addition, just before an // ongoing log is closed out, a call is made to gather memory statistics. Those // memory statistics are deposited into a histogram, and the log finalization // code is then called. In the finalization, a call to a Histogram server // acquires a list of all local histograms that have been flagged for upload -// to the UMA server. +// to the UMA server. The finalization also acquires a the most recent number +// of page loads, along with any counts of renderer or plugin crashes. // // When the browser shuts down, there will typically be a fragment of an ongoing // log that has not yet been transmitted. At shutdown time, that fragment @@ -135,7 +136,7 @@ // of old unsent logs (the appropriate list, depending on whether it is an // initial log, or an ongoing log). An addition, any log that is currently // accumulating is also finalized, and pushed into the unsent log list. With -// those pushed performed, we regress back to the SEND_OLD_INITIAL_LOGS state in +// those pushes performed, we regress back to the SEND_OLD_INITIAL_LOGS state in // case the user enables log recording again during this session. This way // anything we have "pushed back" will be sent automatically if/when we progress // back to SENDING_CURRENT_LOG state. @@ -149,7 +150,7 @@ // twice. For example, if a log file is sent, but not yet acknowledged by // the external server, and the user shuts down, then a copy of the log may be // saved for re-transmission. These duplicates could be filtered out server -// side, but are not expected to be a significantly statistical problem. +// side, but are not expected to be a significant problem. // // //------------------------------------------------------------------------------ @@ -217,7 +218,7 @@ static const int kUnsentLogDelay = 15; // 15 seconds // sending the next log. If the channel is busy, such as when there is a // failure during an attempt to transmit a previous log, then a log may wait // (and continue to accrue now log entries) for a much greater period of time. -static const int kMinSecondsPerLog = 20 * 60; // twenty minutes +static const int kMinSecondsPerLog = 90; // 20 * 60; // Twenty minutes. // When we don't succeed at transmitting a log to a server, we progressively // wait longer and longer before sending the next log. This backoff process @@ -238,7 +239,7 @@ static const int kSaveStateInterval = 5 * 60; // five minutes static const size_t kMaxInitialLogsPersisted = 20; // The number of ongoing logs we're willing to save persistently, and hope to -// send during a this or future sessions. Note that each log will be pretty +// send during a this or future sessions. Note that each log may be pretty // large, as presumably the related "initial" log wasn't sent (probably nothing // was, as the user was probably off-line). As a result, the log probably kept // accumulating while the "initial" log was stalled (pending_), and couldn't be @@ -246,7 +247,7 @@ static const size_t kMaxInitialLogsPersisted = 20; // A "standard shutdown" will create a small log, including just the data that // was not yet been transmitted, and that is normal (to have exactly one // ongoing_log_ at startup). -static const size_t kMaxOngoingLogsPersisted = 4; +static const size_t kMaxOngoingLogsPersisted = 8; // Handles asynchronous fetching of memory details. @@ -480,7 +481,7 @@ void MetricsService::Observe(NotificationType type, HandleIdleSinceLastTransmission(false); if (current_log_) - DLOG(INFO) << "METRICS: NUMBER OF LOGS = " << current_log_->num_events(); + DLOG(INFO) << "METRICS: NUMBER OF EVENTS = " << current_log_->num_events(); } void MetricsService::HandleIdleSinceLastTransmission(bool in_idle) { @@ -640,6 +641,7 @@ void MetricsService::SaveLocalState() { RecordCurrentState(pref); pref->ScheduleSavePersistentPrefs(g_browser_process->file_thread()); + // TODO(jar): Does this run down the batteries???? ScheduleNextStateSave(); } @@ -739,6 +741,8 @@ void MetricsService::PushPendingLogsToUnsentLists() { unsent_initial_logs_.push_back(pending_log_text_); state_ = SEND_OLD_INITIAL_LOGS; } else { + // TODO(jar): Verify correctness in other states, including sending unsent + // iniitial logs. PushPendingLogTextToUnsentOngoingLogs(); } DiscardPendingLog(); @@ -1144,7 +1148,7 @@ void MetricsService::OnURLFetchComplete(const URLFetcher* source, current_fetch_.reset(NULL); // We're not allowed to re-use it. // Confirm send so that we can move on. - DLOG(INFO) << "METRICS RESPONSE CODE: " << response_code << " status=" << + LOG(INFO) << "METRICS RESPONSE CODE: " << response_code << " status=" << StatusToString(status); // TODO(petersont): Refactor or remove the following so that we don't have to @@ -1157,9 +1161,11 @@ void MetricsService::OnURLFetchComplete(const URLFetcher* source, } if (response_code != 200) { + LOG(INFO) << "METRICS: transmission attempt returned a failure code: " + << response_code << ". Verify network connectivity"; HandleBadResponseCode(); } else { // Success. - DLOG(INFO) << "METRICS RESPONSE DATA: " << data; + LOG(INFO) << "METRICS RESPONSE DATA: " << data; switch (state_) { case INITIAL_LOG_READY: state_ = SEND_OLD_INITIAL_LOGS; @@ -1210,14 +1216,10 @@ void MetricsService::OnURLFetchComplete(const URLFetcher* source, } void MetricsService::HandleBadResponseCode() { - DLOG(INFO) << "METRICS: transmission attempt returned a failure code. " - "Verify network connectivity"; -#ifndef NDEBUG - DLOG(INFO) << "Verify your metrics logs are formatted correctly. " + LOG(INFO) << "Verify your metrics logs are formatted correctly. " "Verify server is active at " << kMetricsURL; -#endif if (!pending_log()) { - DLOG(INFO) << "METRICS: Recorder shutdown during log transmission."; + LOG(INFO) << "METRICS: Recorder shutdown during log transmission."; } else { // Send progressively less frequently. DCHECK(kBackoff > 1.0); @@ -1230,7 +1232,7 @@ void MetricsService::HandleBadResponseCode() { TimeDelta::FromSeconds(kMinSecondsPerLog); } - DLOG(INFO) << "METRICS: transmission retry being scheduled in " << + LOG(INFO) << "METRICS: transmission retry being scheduled in " << interlog_duration_.InSeconds() << " seconds for " << pending_log_text_; } @@ -1240,11 +1242,11 @@ void MetricsService::GetSettingsFromResponseData(const std::string& data) { // We assume that the file is structured as a block opened by <response> // and that inside response, there is a block opened by tag <chrome_config> // other tags are ignored for now except the content of <chrome_config>. - DLOG(INFO) << "METRICS: getting settings from response data: " << data; + LOG(INFO) << "METRICS: getting settings from response data: " << data; int data_size = static_cast<int>(data.size()); if (data_size < 0) { - DLOG(INFO) << "METRICS: server response data bad size: " << data_size << + LOG(INFO) << "METRICS: server response data bad size: " << data_size << "; aborting extraction of settings"; return; } @@ -1252,7 +1254,7 @@ void MetricsService::GetSettingsFromResponseData(const std::string& data) { DCHECK(doc); // If the document is malformed, we just use the settings that were there. if (!doc) { - DLOG(INFO) << "METRICS: reading xml from server response data failed"; + LOG(INFO) << "METRICS: reading xml from server response data failed"; return; } |