Shiyonofuji/Kaiopectate 2 Posted January 11, 2006 It is Day 4 of 2006 Hatsu Basho. I have already reached my bi-monthly "axis of despair" -- the point at which I ask myself, "Why did I pick THESE guys instead of THOSE guys?" It is also at this point each basho that I find myself telling myself, "The reason that you enter so many online sumo games is that, the more you enter, the more likely you are to do okay in one of them." What I am thinking about is: Projection of future results -- but let us say CAREER results, rather than bout-by-bout results, from initial points in a career. What does a 48-12 W/L record, at Juryo 2 signify for the future? How about a 39-21 at J6? (And in comparison with each other?) It would be very informative, I think, to collect and chart the HISTORIC TRAJECTORIES of careers through time. I hypothesize that the results will be statistically significant. Once you have enough data -- that is, compilation of a number of career records, then the Confidence Interval may be narrowed to the point at which statistically significant conclusions can be drawn, even taking into account major individual events such as injuries. This is somewhat different than what Doitsuyama charted several months ago (but who knows what else he's got in his statistical chanko-pot?). http://www.sumoforum.net/forums/index.php?showtopic=3624 As to whether comparative career projections would assist individual comparisons, I don't know. However, regardless of the initial hypothesis, one should not limit oneself too soon, in terms of data collection. One should collect all available objective (or objectifiable) data, for the purpose of future analysis. For example, one may be able to chart the progress basho-by-basho of several rikishi with identical 35-25 records; but one may have been 180cm tall and 130kg, and another 175cm/150kg, at that time. Then there are the major factors of banzuke rank, at a given W-L record, and age. 
(And who can say what other findings might be discernible... consistently different results depending upon date, or depending upon ichimon?) And, to widen the scope considerably, this is only one among countless approaches to the subject of prediction. It's likely that people in Japan have engaged in this kind of analysis. Over 20 years ago, in baseball, Bill James first publicized his iconoclastic statistical analysis, which objectively deflated many traditional assumptions (for example, about sacrifice bunts). The Society for American Baseball Research developed "Sabermetrics" (www.sabr.org). Strategy and management, salaries, and virtually everything else in baseball has changed as a result. If the wheel hasn't already been invented, I would call the topic of statistical research and analysis of sumo "Sumometrics." Others might call it Stupid.
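As a rough illustration of the data collection proposed above, career snapshots could be grouped into cohorts by division and winning-percentage band, with each cohort's later outcomes tallied. This is only a sketch of one possible layout: the `Snapshot` fields, the `reached_makuuchi` flag, and the sample entries are all hypothetical, no real career data is used, and rank strings are assumed to be a division code followed by a number (with at least one bout on record).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    shikona: str            # placeholder label, not a real rikishi
    rank: str               # rank when the snapshot was taken, e.g. "J2"
    wins: int
    losses: int
    reached_makuuchi: bool  # hypothetical outcome observed later


def cohort_key(s):
    """Division letters plus win-rate decile, e.g. ("J", 8) for 48-12 at J2."""
    division = s.rank.rstrip("0123456789")
    decile = s.wins * 10 // (s.wins + s.losses)  # integer math avoids float drift
    return division, decile


def cohort_outcomes(snapshots):
    """Map each cohort to (number who reached makuuchi, cohort size)."""
    counts = defaultdict(lambda: [0, 0])
    for s in snapshots:
        tally = counts[cohort_key(s)]
        tally[0] += s.reached_makuuchi
        tally[1] += 1
    return {key: tuple(tally) for key, tally in counts.items()}


# Made-up records, purely to show the shape of the output.
sample = [
    Snapshot("A", "J2", 48, 12, True),
    Snapshot("B", "J4", 49, 11, True),
    Snapshot("C", "J6", 39, 21, False),
    Snapshot("D", "J9", 38, 22, True),
]
print(cohort_outcomes(sample))
```

Under this grouping the 48-12 and 39-21 records from the post fall into different cohorts (deciles 8 and 6), which is the comparison the post asks about. Height, weight, age or heya could be added to the key in the same way, at the cost of smaller cohorts.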
aderechelsea 125 Posted January 11, 2006 i try not to mess with stats in my Sumo gaming because it makes me miss the excitement of the "random guess" or the "gut feeling" ... i only sometimes take into consideration the recent form and short term face-to-face records, but i always go with instinct in the end ... but i am definitely not qualified to talk about sumo gaming since i am pretty mediocre in it .... i am sure there are people around here who will find all these more interesting. ps. i remembered i have another "golden rule" in the Games ... i ALWAYS pick Kaiho as a "winner" in Sekitori-toto no matter who he faces ... (i am sure i had a kuroboshi some days because of that but i cannot make myself click the other option)
Jonosuke 28 Posted January 11, 2006 I believe Bill James now works for the Red Sox after all his years spent outside baseball's inner circle. The A's Billy Beane is still going strong as his disciple, as is J.P. Ricciardi, who tends to his flock not too far from where I live. I guess James is still relevant, but it's pretty much of a stretch to translate his statistical analysis to Ozumo. However, in general terms, some of James' arguments work for Ozumo as well - for instance, some oyakata go more for college grads than middle schoolers, as you never know how the latter turn out (most of them don't). I suppose you can gather stats on the Kyushu, Nagoya and Osaka results against the Tokyo ones. As in baseball, a sacrifice bunt never makes sense in sumo either....
Doitsuyama 1,185 Posted January 11, 2006 Shiyonofuji/Kaiopectate, I wish you good luck in trying to factor in height, weight, banzuke ranks, number of children and shoe size. Oh, blood type is of utmost importance! It's not that difficult. Just make an initial analysis to check if those factors are of any statistical relevance for the outcome of bouts at all. Really not that difficult. But you will have to do it, since I doubt it. I have my reasons why I did my strength ratings like I did them.
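The "initial analysis" suggested in this post can be sketched for a single factor. The example below takes weight as the factor, counts how often the heavier rikishi wins, and computes an exact two-sided binomial p-value against a 50/50 coin flip. It is a minimal sketch under stated assumptions, not anyone's actual method: the function names and the `(winner_kg, loser_kg)` input layout are invented for illustration, and a real check would also need to control for rank, since heavier rikishi may simply be ranked higher.

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided p-value for `successes` out of `trials` against p = 0.5."""
    if trials == 0:
        return 1.0
    # The null distribution is symmetric at p = 0.5, so double the upper tail
    # measured from whichever side is further from the centre.
    extreme = max(successes, trials - successes)
    tail = sum(comb(trials, k) for k in range(extreme, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


def heavier_wins(bouts):
    """Count bouts won by the heavier rikishi.

    `bouts` is an iterable of (winner_kg, loser_kg) pairs; bouts between
    equally heavy rikishi are skipped. Returns (heavier_wins, bouts_counted).
    """
    decided = [(w, l) for w, l in bouts if w != l]
    wins = sum(1 for w, l in decided if w > l)
    return wins, len(decided)


# Hypothetical usage: wins, n = heavier_wins(bouts); two_sided_binomial_p(wins, n)
```

A small p-value would only say that weight is associated with winning in the sample. It would not say by how much, which is the harder question for prediction.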
Shiyonofuji/Kaiopectate 2 Posted January 11, 2006 It's not that difficult. Yes, you're right. Just a hypothesis, a little data collection, some processing and, Presto! OTOH, just following the Tamanoi Blog is a lot more fun: http://www.sumoforum.net/forums/index.php?...=73525&st=175
ikishima 0 Posted January 11, 2006 (edited) Shiyonofuji/Kaiopectate, I am so glad I don't understand what you are talking about. Doitsuyama is correct, blood type is very important in Japan and you have to factor that in. I used to teach in Japan. My students explained to me that Rock, Paper, Scissors is the way that all important decisions are made. I would forget about metrics and simply go that route next time. Edited January 11, 2006 by ikishima
Shiyonofuji/Kaiopectate 2 Posted January 12, 2006 1. I am so glad I don't understand what you are talking about. 2. ...I used to teach in Japan. My students explained to me that Rock, Paper, Scissors is the way that all important decisions are made. I would forget about metrics and simply go that route next time. 1. Me, too. 2. (Nodding yes...) And what did you teach them? My friend handicaps the horses based upon the fuzziness of their coats. But, this is not about important decisions. If this involved the potential link between the SV40 virus found in some oral polio vaccines and the incidence of mesothelial cancer, then I too would resort to Rock, Paper, Scissors. All I want to know is, will Chiyonishiki Jd24E (2-0) have a career like Chiyotaikai or like Chiyohakuho? (Thinking in depth...)
Asashosakari 19,298 Posted January 12, 2006 All I want to know is, will Chiyonishiki Jd24E (2-0) have a career like Chiyotaikai or like Chiyohakuho? (Thinking in depth...) No sophisticated data analysis needed to figure out that somebody who's nearly 24 and has never been above Sandanme 90 probably won't reach Juryo. Although the guy does have one of the coolest career records I've ever seen down there, 70-35-63 as of shonichi. With that kind of win-loss differential guys are normally on the verge of their sekitori debut. But there's the matter of the 63 yasumi and the nine basho spent off-banzuke, of course.
Kintamayama 45,086 Posted January 12, 2006 All I want to know is, will Chiyonishiki Jd24E (2-0) have a career like Chiyotaikai or like Chiyohakuho? (Thinking in depth...) No sophisticated data analysis needed to figure out that somebody who's nearly 24 and has never been above Sandanme 90 probably won't reach Juryo. Although the guy does have one of the coolest career records I've ever seen down there, 70-35-63 as of shonichi. With that kind of win-loss differential guys are normally on the verge of their sekitori debut. But there's the matter of the 63 yasumi and the nine basho spent off-banzuke, of course. OTOH, based on what you write, if he managed to stay healthy for more than 12 minutes, he has a chance, no?
Doitsuyama 1,185 Posted January 12, 2006 All I want to know is, will Chiyonishiki Jd24E (2-0) have a career like Chiyotaikai or like Chiyohakuho? (Thinking in depth...) No sophisticated data analysis needed to figure out that somebody who's nearly 24 and has never been above Sandanme 90 probably won't reach Juryo. Although the guy does have one of the coolest career records I've ever seen down there, 70-35-63 as of shonichi. With that kind of win-loss differential guys are normally on the verge of their sekitori debut. But there's the matter of the 63 yasumi and the nine basho spent off-banzuke, of course. OTOH, based on what you write, if he managed to stay healthy for more than 12 minutes, he has a chance, no? No. His 70-35 career record (without absences) is so good because he so often dropped to a rank below his abilities.
Shiyonofuji/Kaiopectate 2 Posted January 12, 2006 No sophisticated data analysis needed to figure out that somebody who's nearly 24 and has never been above Sandanme 90 probably won't reach Juryo. Although the guy does have one of the coolest career records I've ever seen down there, 70-35-63 Not that anyone has missed my point, but this begins to highlight it better. Age, Won/Loss, Rank. (And injury downtime, although that is in part a co-factor along with age.) No sophisticated data analysis is needed in order to reach broad conclusions. But, what if you'd like to predict something more precise, such as, compare progress of Kokonoe-beya rikishi with those of Azumazeki-beya? And, although no one took Doitsuyama's joke about blood type seriously, (a) sufficient data exists to enable comparison of career trajectories of rikishi from different nations, and (b) as an abstract matter one should not limit one's analysis to study and confirmation of conventional hypotheses. Observing consistent (non-random) results regarding something which should be random could either indicate a deficiency in the study, or it could be a discovery. Having said all that, and as intriguing as this may be to some, including me at times when my sumo-gaming monkey must be talked down from the ledge, the human side of sumo can be much more rewarding, whether that is the intriguing smile by Kyokushuzan as he walks up the hanamichi, or the predictable smile of Yoshikaze as he does the same, or the Christmas lights behind the locker door, and innumerable other things.
Asashosakari 19,298 Posted January 12, 2006 (edited) Age, Won/Loss, Rank. (And injury downtime, although that is in part a co-factor along with age.) No sophisticated data analysis is needed in order to reach broad conclusions. But, what if you'd like to predict something more precise, such as, compare progress of Kokonoe-beya rikishi with those of Azumazeki-beya? And, although no one took Doitsuyama's joke about blood type seriously, (a) sufficient data exists to enable comparison of career trajectories of rikishi from different nations, and (b) as an abstract matter one should not limit one's analysis to study and confirmation of conventional hypotheses. Observing consistent (non-random) results regarding something which should be random could either indicate a deficiency in the study, or it could be a discovery. I agree that all this might be incredibly interesting, but somehow I can't shake the feeling that Ozumo simply doesn't have (and can't have) a sufficient statistical basis. What's there to observe objectively besides age, height, weight, stable membership, records and ranks? On the other hand, your typical baseball box score can be mined for dozens of elementary, observable events (and even more second-order events, e.g. batting results with RISP) which can then be processed into such esoteric things as win shares, VORP and whatnot. You're saying that "sufficient data exists" in Ozumo - I'll add my voice to Doitsuyama's invitation for you to show that this is so. I don't see it, I'm afraid. Edited January 12, 2006 by Asashosakari
Kintamayama 45,086 Posted January 12, 2006 Age, Won/Loss, Rank. (And injury downtime, although that is in part a co-factor along with age.) No sophisticated data analysis is needed in order to reach broad conclusions. [...] I personally would like to see a stat for FPT (farts per tachiai). I know for a fact it happens quite a lot, statistics notwithstanding.
Zentoryu 154 Posted January 12, 2006 (edited) I personally would like to see a stat for FPT (farts per tachiai). I know for a fact it happens quite a lot, statistics notwithstanding. I have to remember to never be in the process of drinking something when reading one of Moti's posts. B-) ;-) Edited January 12, 2006 by Zentoryu
Misisko 0 Posted January 12, 2006 Have you ever heard of "neural network" programs??? With the right collection of input data it can work!!! (Applauding...)
ikishima 0 Posted January 12, 2006 2. (Applauding...) And what did you teach them? My friend handicaps the horses based upon the fuzziness of their coats. But, this is not about important decisions. If this involved the potential link between the SV40 virus found in some oral polio vaccines and the incidence of mesothelial cancer, then I too would resort to Rock, Paper, Scissors. I was on the JET Programme (theoretically) teaching English in a public high school. I am not sure any of it stuck. I have to think that there are too many factors that go into determining who will win any bout. I remember when the Dejima express was running at full speed. Back then there was only one factor: can his opponent live through the first 3 seconds of the match? If so, he won. FPT would be much easier to calculate.
Bishonohana 0 Posted January 13, 2006 I think I need a drink... (Applauding...)
Gusoyama 103 Posted January 13, 2006 I would definitely add length of bout if you could
Kintamayama 45,086 Posted January 13, 2006 I would definitely add length of bout if you could Don't you know that length isn't important?
Shiyonofuji/Kaiopectate 2 Posted January 14, 2006 You're saying that "sufficient data exists" in Ozumo - I'll add my voice to Doitsuyama's invitation for you to show that this is so. I don't see it, I'm afraid. Maybe Doitsuyama and Moti are not disclosing everything that they know. Moti obviously is holding back fart data. Doitsuyama has correctly predicted over two-thirds of the bouts which he has selected in the "Sumo Game" during the past 5 years, and Moti [Kintamayama] only slightly less than that. See http://www.japanguide.info/sumo/stats.html?2 Plainly each has ordered knowledge which he considers predictive of future success. I'm not sure what the "this" is, which I must now show, as punishment for clumsily making a pretty obvious point. Must I demonstrate that certain factors are demonstrably associated with success? Or must I announce that Kotokuroda will make makuuchi? Does anyone really doubt that materially significant factors exist? Does anyone claim that you simply can't associate any current information with future success? Anyone who wants to bet against Asashoryu tomorrow? Or, how about a wager that Sawai 31-4 [Mk2] will not do better than Fukukasuga 82-190 [Jk22]? Or that Kusakiyo [74 kg] might have trouble defeating Kainowaka [210.5 kg]? The ISP, for example, offers contestants options in order to make an automated choice:
Wins Head-to-head: Most / Least
Weight: Heavier / Lighter
Last Bout Head-to-head: Won / Lost
Rank: Higher / Lower
Wins This Basho: Most / Least
Last Bout: Won / Lost
Height: Taller / Shorter
Age: Older / Younger
Side: East / West
http://isp.sumogames.com/ [must register, log in and follow link] The ISP's webmaster may even know which are the best predictors, so I don't need to have all of the answers myself in order to defend a proposition that answers are out there waiting to be figured out. Sure, I could use more data, but I could use better statistical/spreadsheeting skills even more.
Those who doubt that there's "sufficient data" available should be introduced to the others who say that there are too many factors to consider. I think there's plenty of data out there to compile, and plenty of data which people haven't processed. Is this really not being done in Japan?! I can't believe that...is there any gambling over sumo? No handicapper would run a book without resort to some predictive method. There are many aspects of sumo -- at least as many as baseball -- which are enjoyable. Not too much different than the many features of sumoforum.net: aspects of SABR, which I mentioned previously, are devoted to science and statistics, while others focus upon other worthy topics. See http://www.sabr.org/sabr.cfm?a=com&m=5 Sumo statistical analysis should go public. So many computer geek males, so many statistic-laden games. Consider Masumi Abe's "Quality Index" http://www.scgroup.com/sumo/Hatsu06/QualityI.html Consider John Jermanis's "Power Rankings" http://www.banzuke.com/99-4/msg00163.html (and see my comment 6 years ago: http://www.banzuke.com/99-4/msg00040.html). What other formulas are people using?
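One way to find out which of the ISP-style automated options are the better predictors would be to replay each single-factor rule over past bouts and record how often it picks the winner. The sketch below does that for a handful of the options. It is not the ISP's code: the `Rikishi` fields, the rank encoding (smaller number means higher rank), the rule names and the `(east, west, east_won)` bout layout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rikishi:
    rank_value: int   # assumed encoding: smaller means higher on the banzuke
    weight_kg: float
    height_cm: float
    age: int


# Each rule returns True when it picks the east rikishi.
# On a tie the comparison is False, so the rule falls back to west.
RULES = {
    "rank_higher": lambda e, w: e.rank_value < w.rank_value,
    "weight_heavier": lambda e, w: e.weight_kg > w.weight_kg,
    "height_taller": lambda e, w: e.height_cm > w.height_cm,
    "age_younger": lambda e, w: e.age < w.age,
    "side_east": lambda e, w: True,
}


def hit_rates(bouts):
    """Return rule name -> share of bouts in which the rule picked the winner.

    `bouts` is a list of (east, west, east_won) tuples.
    """
    rates = {}
    for name, picks_east in RULES.items():
        correct = sum(1 for e, w, east_won in bouts if picks_east(e, w) == east_won)
        rates[name] = correct / len(bouts) if bouts else 0.0
    return rates
```

Any rule that lands well above 50% on a large history would be a candidate predictor. The head-to-head and "this basho" options would need bout history per rikishi pair and are left out here.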
Asashosakari 19,298 Posted January 14, 2006 Doitsuyama has correctly predicted over two-thirds of the bouts which he has selected in the "Sumo Game" during the past 5 years, and Moti [Kintamayama] only slightly less than that. See http://www.japanguide.info/sumo/stats.html?2 As have lots of other players on that list, most of which I'd suspect have never used any kind of sophisticated statistics to make their picks. (Actually, I'm pretty sure Moti hasn't, either.) I'll admit I'm only vaguely familiar with Sumo Game, but you really can't do worse statistically than 5 out of 10 on average even if you pick by throwing darts at the torikumi. Also, keep in mind that, say, 7 out of 10 means the player picked 7 out of a field of about 21 winners correctly. Given that there are always safer picks and less-safe picks on a day, I think you're overestimating what two-thirds average score really means. You might be interested in this post about Sekitoto (which handicaps the players' performance a lot more than Sumo Game does, by making you pick every single bout) from about a year ago. Note that even the very best players are beating random chance by less than 10 percentage points. In short, it's difficult to exactly predict success in sumo, even on a bout-by-bout basis. And even moreso across an entire career. I could probably come up with a dozen guys from the last couple of years who looked like at least Juryo-caliber rikishi at one point, only to have an injury derail everything. Injuries play a huge role in Ozumo success, and in my opinion they play a bigger role than many (perhaps all) tangible factors you may be able to isolate. Plainly each has ordered knowledge which he considers predictive of future success. For very short values of "future" in this case, namely one day. Does anyone really doubt that materially significant factors exist? Did anybody say that they don't? I don't think so. 
Notwithstanding what others' opinion may be, mine is that you simply won't be able to isolate any materially significant factors in a statistically meaningful way. Sawai was a high school yokozuna - that's certainly materially significant, but it's also something that nobody needs a special formula for in order to factor it into their expectations of his future success. And even then, Sawai's just one injury away from not making it to Juryo. Just ask a few of the recent Makushita tsukedashi starters (tsukedashi qualification being perhaps the best indicator of future success, as it's so hard to get now) - Takamifuji, Nakano, Hakiai, Asahimaru, all massively derailed by injuries already. The crop of Ms15 tsukedashi rikishi who have made it to Juryo and above is exactly the same size: Kakizoe, Futeno, Takekaze, Kambayashi. Right now it looks like only Takamifuji will join them anytime soon, if ever. So that's a 62.5% likelihood of making it to Juryo for this limited sample of rikishi - and that's a group with quite possibly the most prestigious qualification available. Good luck isolating factors that are more arcane than that. Does anyone claim that you simply can't associate any current information with future success? Anyone who wants to bet against Asashoryu tomorrow? Or, how about a wager that Sawai 31-4 [Mk2] will not do better than Fukukasuga 82-190 [Jk22]? Or that Kusakiyo [74 kg] might have trouble defeating Kainowaka [210.5 kg]? Again, nobody needs to make statistical inferences in order to judge such trivial cases. You're building an army of strawmen (straw-rikishi?) here. The ISP's webmaster may even know which are the best predictors, so I don't need to have all of the answers myself in order to defend a proposition that answers are out there waiting to be figured out. Sure, I could use more data, but I could use better statistical/spreadsheeting skills even more. 
To speak for myself, my intuition (based on following a boatload of rikishi in the lower divisions, and frequently being flabbergasted by their career trajectories*) is that any such effort will be akin to chasing ghosts, and the opportunity cost of my time is a little too high to commit myself to such an activity. * Note what Doitsuyama wrote about Dairaido in the foreword to this daily results posting the other day. I'd noticed the same thing before the basho, and you know what? If somebody had told me before Hatsu 2005 (when Dairaido was Ms22w) that he'd be knocking on the door to Makuuchi just 12 months later, I'd probably considered them borderline-nuts. There simply wasn't anything in his recent results prior to the last year that would have pointed to such a stunning improvement. Was Dairaido on the "probably capable of becoming a sekitori sometime" list? Sure, but so are dozens of others in Makushita and below, and most of them will never make it. Those who doubt that there's "sufficient data" available should be introduced to the others who say that there are too many factors to consider. Well, thanks for introducing me to myself, then. My opinion is that there are too many factors to consider, almost none of which allow for the collection of sufficient data for a meaningful statistical analysis. I think there's plenty of data out there to compile, and plenty of data which people haven't processed. You keep saying this, yet you can't seem to come up with any evidence to support your belief. Sumo statistical analysis should go public. So many computer geek males, so many statistic-laden games. Consider Masumi Abe's "Quality Index" http://www.scgroup.com/sumo/Hatsu06/QualityI.html I'm sorry, but I consider the Quality Index completely useless. Nice little idea, just no meaningful output. Not that it matters - The Quality Index is merely a translation of a rikishi's basho record with some additional data figured in. 
I fail to see why you're bringing it up in the context of predictions, since it does no such thing. Consider John Jermanis's "Power Rankings" http://www.banzuke.com/99-4/msg00163.html (and see my comment 6 years ago: http://www.banzuke.com/99-4/msg00040.html). Looks somewhat similar in idea to Doitsuyama's ratings. The point you continue to miss is that these things don't allow for any meaningful long-term forecasting. Their existence doesn't prove the feasibility of your ideas in any way. What other formulas are people using? For pre-basho games, I generally look over each rikishi's semi-recent tournament records, try to remove outliers from consideration (injury basho, etc.), then make my picks. I've occasionally peeked at Doitsuyama's ratings, but I find that they're confirming my gut feeling 90+% of the time, so it's usually not worth the effort. For dailies, I do much the same thing, just with head-to-head results, and factoring in the events of the current basho. I don't think I've ever used any kind of statistical formula to make game picks, and I'm certainly one of the more stats-inclined people on here. It's just not worth the investment of time and effort, in my opinion. Maybe I'm wrong - but the onus is on you to demonstrate that. Talk is cheap, especially if you only end up quasi-religiously repeating the same assertions over and over. You seem to assume implicitly that I'm rejecting your claims because I haven't thought about the subject enough. I can assure you that's not the case, not by a longshot. On the contrary, I'll assert that it's you who hasn't pondered the subject nearly enough. You're in love with a Big Idea, and those pesky details be damned.
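The 62.5% figure in this post (five of eight Ms15 tsukedashi starters, counting Takamifuji as a future success) rests on a very small sample, and a confidence interval makes the point about insufficient data concrete. The sketch below uses the Wilson score interval, which is one common choice for small-sample proportions. The function name and the 95% level (z = 1.96) are choices made here, not anything stated in the thread.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


low, high = wilson_interval(5, 8)
print(f"5 of 8: {low:.1%} to {high:.1%}")
```

For 5 of 8 the interval runs from roughly 31% to 86%, so even this "most prestigious qualification" sample is compatible with anything from a one-in-three to a better than four-in-five chance of reaching Juryo.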
Jejima 1,384 Posted January 14, 2006 Kabochajima has an interesting method for picking her bench sumo team - and has been doing better than me recently (when I actually spend some time working out who will do well in the basho, and then adding Kaio to the mix). What she does is first to pick all the available 'ho's' (Her surname is 'Ho'), and then fill in the rest of her team with rikishi that she remembers from our trip to Japan. Not too bad a technique, as she currently has Roho, Hokutoriki and Hakuho in her squad ;-)
Kintamayama 45,086 Posted January 14, 2006 (edited) Maybe Doitsuyama and Moti are not disclosing everything that they know. Moti obviously is holding back fart data. Doitsuyama has correctly predicted over two-thirds of the bouts which he has selected in the "Sumo Game" during the past 5 years, and Moti [Kintamayama] only slightly less than that. I have no idea what a database looks like. Doitsuyama and I are bitter arch-rivals in all games, yet use totally different systems. Doits uses science + gut-feeling, I use instinct. Sometimes it works, sometimes it doesn't. I admit that I know the rikishi and do take into account what their past matchups were, as these stats are sometimes out there with the entry form. But, in the long run, Doits and I are usually around the same results during bashos, though he does tend to beat me in head to heads. I just do my entries real quick so I can go about my business of doing nothing, or updating my sites manually, (which, I do admit, enhances my knowledge of all foreigners, Makushita and Old-timers..). My "system" notwithstanding, to each his own. It's a free country. Personally, if I were bound by databases and spreadsheets, I would quickly get bored senseless and actually try to find a real job. Edited January 14, 2006 by Kintamayama
Naganoyama 5,905 Posted January 14, 2006 I use mostly gut feel, a small amount of examination of head-to-head results and recent basho performance and then sit back and think 'When this bout happens tomorrow, will I feel happy cheering on this rikishi?' - if the answer is no, then I reverse my choice. (Perhaps this is the same reasoning which makes Aderechelsea pick Kaiho a lot). I don't think I would perform much better if I used a smarter technique, but I would probably enjoy the sumo less. I imagine that if you accumulated each rikishi's scores based on my daily picks you would see a lot of strange people regularly getting double digit wins.
Kotoseiya Yuichi 3 Posted January 15, 2006 I select the winning rikishi according to which one of my nuts happens to itch more at the time of selection. Left for higashi, right for nishi. Hey, so far it's been better than 50%...