4 Comments
Stephanie Losi's avatar

Wow, otherizing the outliers and just tossing them? That's like looking at market performance by year and saying, "Welp, we'll just delete the data points for 1931, 1937, and 2008, those can't be important. Must be an error."

I don't think a finance modeler would do that, but in other situations/scenarios outliers *do* get tossed. What do you think the difference is?

Harry Crane's avatar

I think plenty of finance modelers do exactly that. It's a delicate situation: when the model is misspecified (which it almost always is), including huge outliers can distort the fit for all of the data. But removing them and forgetting about them will understate uncertainty and overstate confidence that a big outlier will never happen again.
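To make the trade-off concrete, here is a minimal sketch. The return series is made up for illustration (it is not market data), and the `drop_outliers` helper and its `threshold` cutoff are hypothetical names chosen for this example. It compares the estimated volatility and worst observed loss with the extreme years kept in and tossed out.

```python
import statistics

# Hypothetical annual returns: made-up numbers for illustration, not market data.
returns = [0.08, 0.12, -0.03, 0.10, 0.05, -0.45, 0.07, 0.11, -0.38, 0.09]


def drop_outliers(values, threshold=0.25):
    """Keep only values whose absolute size is within the (assumed) cutoff."""
    return [v for v in values if abs(v) <= threshold]


trimmed = drop_outliers(returns)

# Sample standard deviation with and without the extreme years.
full_sd = statistics.stdev(returns)
trimmed_sd = statistics.stdev(trimmed)

# Worst loss the modeler would ever "see" in each version of the data.
worst_full = min(returns)
worst_trimmed = min(trimmed)

print(f"all years: sd={full_sd:.3f}, worst={worst_full:.2f}")
print(f"trimmed:   sd={trimmed_sd:.3f}, worst={worst_trimmed:.2f}")
```

On the trimmed series the estimated spread is smaller and the worst recorded year is mild, which is the sense in which tossing outliers understates uncertainty.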

Stephanie Losi's avatar

It seems bizarre to me that, even knowing the context of those years being important, the data points could get tossed, rather than adjusting the model spec or at least being able to toggle those data points in and out.

Harry Crane's avatar

Easier said than done. The limits are mostly technical, not conceptual. It's easy to think of how you'd model those anomalous situations in theory. In practice, only a much smaller subset of models is computationally feasible for applications, and that subset shrinks as the complexity and size of the application increase.
