
    Predictive analysis in Major League Baseball

    2024-04-30
    By W. H. Johnson

    MLB Network host and longtime baseball commentator Brian Kenny delivered the keynote address at the 2024 SABR Analytics Conference in Phoenix. Always an engaging and dynamic speaker, Kenny brought his journalist’s eye to the topic of predictive analytics when he posed the question, “What Happened to the 2023 San Diego Padres?”

    That team, managed by a former Manager of the Year, with a roster replete with All-Star and Cy Young Award-level pitchers as well as stars with names like Soto, Bogaerts, Tatis (Jr.), and Machado, needed a five-game winning streak to close the season in order to salvage an 82-80 record and a third-place finish in the NL West. The team that every CBS Sports prognosticator – smart people with plenty of baseball experience – picked to win the division did not even qualify for the playoffs. Within Kenny’s words was an unavoidable question: How had the analysts missed so badly, and so universally?

    It was a simple inquiry from Kenny, posed in support of his contention that perhaps we’re still not collectively measuring and analyzing the most relevant data, in part because we’re drilling so deeply into areas with which the analytics community is comfortable. Thus the debate over the relative value of advanced analytics in baseball continues, at least in the media (social and otherwise) and the newsletter/blogosphere.

    The crux of the sports analytics movement, as often espoused (or groused about, depending on who is writing) in the press, began with the use of quantitative analysis to understand how to better value players and performance, and thus optimize roster construction and payroll. Analytics now appear more valued in terms of predictive modeling of those player performances within a nearly infinite combination of circumstances.

    The desire to best position, and then to pay, a roster of players, thus putting a team in a position from which it is poised to win at the greatest value, is the figurative pot of the Leprechaun’s gold at the end of the rainbow for every baseball executive. It is (or should be) the dream of every General Manager to be able to give his scouting and on-field leaders the ultimate toolbox, with a plan for each and every player, so that the entire organization may succeed.

    But analytics in general continues to absorb criticism from fans and some pundits. There is an unofficially defined border between the old-school baseball people, those who played and coached at one time or another, who earned their expertise through experience, and what one former scout calls the “skinny leg jeans” crowd. The latter stereotype, in his view, applies to the current and future generation(s) of young quantitative analysts that just about every team now employs, economics majors and statisticians who seek to apply econometric modeling skills to the Big Data galaxy of baseball.

    Part of the friction between the two camps – evidently – stems from both talking past each other. The baseball lifers trust their eyes and experiences gathered over decades in the game, but do not necessarily believe that the entirety of a player can be reduced to measurements and extrapolations. The newer group, for lack of a better term, generally possesses a near-genius ability to create programs that correlate and parse an incredibly wide range of collected data, mostly in the search for the key factors that, if manipulated properly, will produce optimum results for the lowest investment. As most reading this will likely agree, both are necessary.

    Just about every statistical model can be, and generally is, used for prediction regardless of the field of inquiry. In baseball, the longer you’ve been a fan, the longer you’ve used some form of predictive model. From the old back-of-the-baseball-card calculations (e.g. “Yaz is batting .338, but hasn’t had a hit in two days, so he’s probably due for one today”) to identifying extremely sophisticated data relationships, the reliance on a particular player demonstrating consistency over time is vital to such guesses.

    https://img.particlenews.com/image.php?url=0Di6Mn_0sjH0reV00
    Brian Kenny. Photo by MLB Network

    One group of models, the class with which most of us are familiar (even if we never took a single class in statistics) is parametric modeling. In that class, models are founded on stipulated assumptions about the selected population. But assumptions are only a substitute for facts. In our case, big league ball players and established playing fields offer the illusion that all of baseball can, ultimately, be observed, measured, and analyzed.

    However, the models are, essentially, just gross (large) generalizations. A curve might spin at one rate on Tuesday and another on Saturday. A batter might be distracted one night and completely focused a week later. A callus or small bruise may exert a slight but discernible influence on an at-bat. Over the course of a season that batter may post 700 plate appearances, a large number in baseball but a small one if the goal is to precisely model that player against a particular pitcher in a particular stadium for the purpose of lineup position and the like. In other words, as much as we believe that we can capture every action on the diamond, at best we can capture only results, and drawing hard conclusions from that limited model and data is, to quote the poets, fraught with peril.
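
    To put a rough number on that sample-size point, here is a minimal sketch in Python. It is an illustration only, not anything presented at the conference: it assumes each plate appearance is an independent trial with a fixed chance of reaching base (a hypothetical .320 rate), uses the textbook normal approximation for a proportion, and compares a 700-plate-appearance season with a hypothetical 20-plate-appearance history against one pitcher. The function name and all of the inputs are assumptions chosen for illustration.

```python
import math


def margin_of_error(rate: float, trials: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed rate.

    Assumes independent trials with a constant success probability
    (a simplification; real plate appearances are neither).
    """
    return z * math.sqrt(rate * (1.0 - rate) / trials)


# Hypothetical inputs: a .320 on-base rate, a 700-PA season,
# and a 20-PA history against a single pitcher.
ON_BASE_RATE = 0.320
season_margin = margin_of_error(ON_BASE_RATE, 700)
matchup_margin = margin_of_error(ON_BASE_RATE, 20)

print(f"Full season (700 PA): +/- {season_margin:.3f}")
print(f"One matchup (20 PA):  +/- {matchup_margin:.3f}")
```

    Under those assumptions the full-season figure is uncertain by about 35 points of on-base percentage, while the single-matchup figure is uncertain by roughly 200 points, and the normal approximation itself is shaky at so few trials. Slice the season further by stadium or lineup slot and the uncertainty only grows.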

    Is there value in traditional scouting and evaluation? Absolutely. Analytical modeling and predictions? Equally critical.

    In the aforementioned conference remarks, Brian Kenny was asking a question that should be used to frame every decision and tactic used to win on the field. What information should be collected and analyzed to best inform every person in the organization? Predictive analytics certainly have their place, and so much room to grow, but they will always remain part of the decision team, not the sole player in the field, and are only useful to the degree that the data and models are viable.

    IBWAA member W.H. “Bill” Johnson has contributed to SABR’s Biography Project, written extensively on baseball history, and presented papers at related conferences. Bill and his wife Chris currently reside in Georgia. He can be contacted on Twitter: @BaseballStoic.

