Analysis Confusion

Hello,

We're having some trouble making inferences from our S3 Analysis page. I made sure to read the Simpleton's Guide on Webalizer but it wasn't specific enough to answer these questions:

1. We have a huge number of hits to our root directory. We have no links to our root and no files hosted there - only subdirectories are present. How can we explain this?

2. Jakarta Commons accounts for the vast majority of our User Agents, followed by Axis/1.3. What are these agents and what can we infer from their numbers?

3. In the Hits by Response Code section, we have a significant percentage of Code 206 (partial content) returns. What does that code mean and is it a problem?

4. Finally, there is only a slight variance in percentage of hits/data per hour on our Hourly Statistics page. Since we're hosting a financial news website this does not seem to make sense (our peak traffic is during stock market hours). Is there any way to explain that?


Thanks for helping us interpret these numbers and terms.

-Luke

Luke LaVanway - Benzinga Radio
www.benzinga.com
Wednesday, June 1, 2011




No problem. The mystery traffic in your first 2 questions is from bots. Every S3 bucket seems to get that same traffic pattern to its root and /soap folders, probably from a misguided spider looking for vulnerabilities in your "web site".
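If you want to strip that traffic out before counting, one rough approach is to filter on the user agent and the requested key. This is only a minimal sketch, assuming each log entry has already been parsed into a key and a user agent string. The marker strings come from the agents named above; the function name and the list of "root" keys are assumptions for illustration.

```python
# Hypothetical helper: flags log entries that look like the bot traffic
# described above. Both lists are assumptions; extend them as needed.
BOT_AGENT_MARKERS = ("jakarta commons", "axis/")
# Root requests may show up as an empty key, "/", or "-" depending on the log.
BOT_KEYS = ("", "-", "soap", "/soap")


def looks_like_bot(key, user_agent):
    """Return True if the request matches the root or /soap bot pattern."""
    agent = (user_agent or "").lower()
    if any(marker in agent for marker in BOT_AGENT_MARKERS):
        return True
    # Trailing slashes are dropped so "/" and "soap/" match the list above.
    normalized = (key or "").strip().rstrip("/").lower()
    return normalized in BOT_KEYS or normalized.startswith(("soap/", "/soap/"))
```

Anything this returns True for can be left out of the totals. It will miss bots that send a browser-like user agent, so treat the result as a lower bound on bot traffic.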

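The 206 responses are exactly what they describe: partial content. The server returns a 206 when a client asks for only part of a file with a Range request, which media players and download managers do routinely when they stream, skip around in a file, or resume a download that was cut off earlier. It's not a problem.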
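Because one listener can generate several 206 hits for a single file, raw hit counts can overstate downloads. A rough way to estimate complete downloads is to add up the bytes sent per client and file and compare the total with the file size. This is a sketch only, assuming each log entry has been parsed into a dict with the hypothetical fields ip, key, status, bytes_sent and object_size.

```python
from collections import defaultdict


def estimate_complete_downloads(entries):
    """Count (client, file) pairs whose 200/206 responses add up to the full file.

    `entries` is an iterable of dicts with the assumed keys
    ip, key, status, bytes_sent, object_size.
    """
    bytes_by_pair = defaultdict(int)
    size_by_key = {}
    for entry in entries:
        if entry["status"] not in (200, 206):
            continue  # skip errors, redirects and other non-content responses
        bytes_by_pair[(entry["ip"], entry["key"])] += entry["bytes_sent"]
        size_by_key[entry["key"]] = entry["object_size"]

    downloads = defaultdict(int)
    for (ip, key), sent in bytes_by_pair.items():
        size = size_by_key[key]
        # Count a pair once if the client received at least the whole file.
        if size > 0 and sent >= size:
            downloads[key] += 1
    return dict(downloads)
```

This counts each client at most once per file, so repeat listens are not included. Overlapping ranges and several listeners behind one IP address will both skew the numbers, so read the result as an estimate.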

Finally, the chart I see when I look at your stats shows a nice curve peaking during the day and subsiding at night. Exactly what I'd expect, given the audience you describe. It seems that your daily peak is almost 10x that of your lowest point at night, which is more pronounced than most sites.
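To put a number on that curve yourself, you can compare the busiest and quietest hours. Since 24 hourly percentages have to share 100%, the differences can look small in percentage points even when the ratio between hours is large. A small sketch, assuming you have a list of hourly hit counts; the function name is made up for illustration.

```python
def peak_to_trough(hourly_hits):
    """Return (peak_hour, trough_hour, ratio) for a list of hourly hit counts."""
    if not hourly_hits:
        raise ValueError("need at least one hourly count")
    hours = range(len(hourly_hits))
    # Index of the busiest and quietest hour (first one wins on ties).
    peak_hour = max(hours, key=hourly_hits.__getitem__)
    trough_hour = min(hours, key=hourly_hits.__getitem__)
    trough = hourly_hits[trough_hour]
    # Avoid dividing by zero when an hour had no hits at all.
    ratio = hourly_hits[peak_hour] / trough if trough else float("inf")
    return peak_hour, trough_hour, ratio
```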

Jason Kester
Wednesday, June 1, 2011




Thanks for clearing that up, Jason.

benzingaradio
Thursday, June 2, 2011




We've got a couple more questions for you guys pertaining to hits that "count" and hits that come from automated sources.

1. A significant portion of our hits are coming from the iTMS user agent, which we understand to mean iTunes Music Store. Are these actual user downloads or just an automated function of iTunes, such as updating the podcast feed in the iTunes Store?

2. The URLs of our .xml files for the RSS feeds are getting a large percentage of our hits - many more than any individual podcast. Do these hit figures mean that a large number of users have downloaded our RSS feeds or are the .xml hits being skewed by bots more than the podcast files are?


Thanks in advance. We're trying to get real hit figures to present to advertisers and this forum is a big help.

Luke LaVanway - Benzinga Radio
www.benzinga.com
Monday, June 6, 2011

