Sports Reference Blog

Archive for the 'expire120d' Category

New Bot Filtering and Content Delivery Network for Sports Reference Sites

7th October 2022

Our devops department finished our move from AWS's CloudFront Content Delivery Network (CDN) to Cloudflare's CDN this week. This will be undecipherable technical jargon to most of you, so I'll explain what it means.

When you request a page or file from the site, the request first goes to a CDN server located geographically close to you. If a neighbor of yours (within several hundred miles) has requested the same page earlier in the day, you'll get a cached copy of that page much faster than you would otherwise. If your neighbors are lame and don't use our site, or just haven't visited that page, the request is passed on to our servers and we send you the page.

This takes load off our servers, and the more the sites are used, the faster things get for our users in general.
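For the technically curious, the hit-or-miss behavior described above can be sketched in a few lines of Python. This is a toy model only, not how Cloudflare or our servers are actually implemented: the `EdgeCache` class, the `fetch_from_origin` callback, and the one-hour `ttl_seconds` default are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one CDN location sitting in front of an origin server.

    All names and the default TTL are illustrative assumptions.
    """

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps a page path to (time it was stored, page body).
        self._store: Dict[str, Tuple[float, str]] = {}
        self.origin_requests = 0

    def get(self, path: str) -> str:
        now = self.clock()
        cached = self._store.get(path)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            # Cache hit: a "neighbor" asked for this page recently.
            return cached[1]
        # Cache miss (or stale copy): pass the request on to the origin.
        self.origin_requests += 1
        body = self.fetch_from_origin(path)
        self._store[path] = (now, body)
        return body
```

In this sketch, the first `get()` for a path counts as one origin request and every repeat within the TTL is served from the cache, which is where the load savings come from.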

Cloudflare also includes bot filtering with this offering. It actively searches for badly behaved bots and proactively blocks them from our site. We are now using this feature because we get scraped. A LOT. And if your bot is badly behaved, then we don't want you impacting the performance of the site for our other users. We've received a number of emails about this during the week, as people who have been scraping us have found themselves blocked.

As of today, this filtering will remain in place for our:

  • Soccer/Football
  • Basketball
  • Hockey
  • College sites

We are turning off the active bot filtering on our baseball and pro football sites at this time. We will re-enable bot filtering the week after the end of their respective seasons.

What is a well-behaved bot? Please see our Data Use page and our Bot Traffic page for guidance. Note that if you are blocked by our servers, the blocks reset after 24 hours, so with some more polite settings on your bot you may be able to try again the next day.
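As a rough illustration of what "more polite settings" can look like in practice, here is a minimal sketch of a request throttle in Python. It is not an official recommendation: the `Throttle` class and its `min_interval_seconds` parameter are hypothetical, and the actual limits to respect are the ones given on the Bot Traffic page, not any number shown here.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforce a minimum gap between consecutive requests.

    Hypothetical helper for illustration; pick the interval from the
    site's published guidance, not from this example.
    """

    def __init__(
        self,
        min_interval_seconds: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_seconds = min_interval_seconds
        self.clock = clock
        self.sleep = sleep
        self._last_request: Optional[float] = None

    def wait(self) -> float:
        """Pause if the previous request was too recent; return seconds slept."""
        now = self.clock()
        delay = 0.0
        if self._last_request is not None:
            elapsed = now - self._last_request
            delay = max(0.0, self.min_interval_seconds - elapsed)
            if delay > 0:
                self.sleep(delay)
        # Record when this request is allowed to go out.
        self._last_request = now + delay
        return delay
```

A scraper would create one `Throttle` and call `wait()` immediately before every page request, so that no two requests go out closer together than the chosen interval.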

We cannot provide an API (per our data licensing agreements), and you should not view us as a data provider (on par with SportRadar or Genius Sports). This is a role that we do not and cannot support with company resources.

Posted in Announcement, Baseball-Reference.com, Basketball-Reference.com, CBB at Sports Reference, CFB at Sports Reference, Data, expire120d, FBref, Hockey-Reference.com, Pro-Football-Reference.com, Stathead

New Play Index Data: 1916, 1917 and more 1940’s PBP

18th December 2012

Thanks to the great folks at RetroSheet, we are now able to launch box scores and game logs for 1916 and 1917, additional play-by-play for 1945-1947, and additional games found here and there for 1948-1973.

For a full list of our data coverage, check out our Data Coverage page.

I have yet to re-run the full site, so the 1916 A's and 1917 retirees will not yet have links to splits or gamelogs, but that should happen later today or tomorrow.

Box Score Directory (1916-2012)

Note: RetroSheet also has 1915 AL and NL box scores, but not yet Federal League boxes. Given our setup we require a full set to be available before we can publish a year's worth of box scores.

Posted in Advanced Stats, Announcement, Baseball-Reference.com, expire120d, History

2012-2013 Major League Baseball Free Agent List

2nd November 2012

Looking for a list of 2012-13 MLB Free Agents this offseason? Go to this page:

2012-2013 Free Agents with Statistics

There you'll find a table like this (except with even more stats):

Posted in Announcement, Baseball-Reference.com, expire120d, Features