https://www.reddit.com/r/LocalLLaMA/comments/1mvnmjo/my_llm_...
I doubt a better one would cost $200,000,000.
The article makes a comparison to financial backtesting. If you form a dataset of historical prices of stocks which are _currently_ in the S&P500, even if you only use price data before time t, models trained against your data will expect that prices go up and companies never die, because they've only seen the price history of successful firms.
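To make the survivorship-bias point concrete, here's a toy sketch (all data simulated, every number made up):

```python
import random

# Toy sketch of survivorship bias in backtesting. Selecting firms by *current*
# index membership silently drops everything that failed before today, while a
# point-in-time dataset keeps the failures in.

random.seed(0)

def simulate_firm(fails):
    """Toy monthly returns; failing firms crash and delist partway through."""
    returns = [random.gauss(0.01, 0.05) for _ in range(120)]
    if fails:
        crash_at = random.randrange(20, 100)
        returns = returns[:crash_at] + [-0.9]  # delisted after a collapse
    return returns

# 500 firms as of time t; 80 of them will have failed by "today".
firms = [simulate_firm(fails=(i < 80)) for i in range(500)]

# Biased dataset: only firms still alive today.
survivors = [r for r in firms if r[-1] != -0.9]
# Point-in-time dataset: every firm that existed at time t, failures included.
all_firms = firms

def mean_return(datasets):
    rets = [x for r in datasets for x in r]
    return sum(rets) / len(rets)

print(f"survivor-only mean monthly return : {mean_return(survivors):+.4f}")
print(f"point-in-time mean monthly return : {mean_return(all_firms):+.4f}")
# The survivor-only estimate is systematically rosier: the model never sees a
# firm die, so it learns that prices go up and companies never do.
```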
For example: Chernobyl, COVID, the 2008 financial crisis, and even 9/11.
If you had a financial model that somehow predicted everything but black swan events, that would still be enough to make yourself rich beyond belief.
You give an LLM all the information from right before a topic was discovered or invented, and then you see if it can independently generate the new knowledge or not.
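A rough sketch of that setup, assuming your documents carry publication dates (the corpus, dates, and cutoff here are just placeholders):

```python
from datetime import date

# Hypothetical corpus of (publication_date, text) pairs. The cutoff is set
# just before the discovery we want the model to re-derive on its own.
corpus = [
    (date(1904, 6, 1), "Lorentz transformations in electrodynamics..."),
    (date(1905, 9, 26), "On the electrodynamics of moving bodies..."),
]

CUTOFF = date(1905, 1, 1)  # train only on documents strictly before this

train_docs = [text for published, text in corpus if published < CUTOFF]
held_out = [text for published, text in corpus if published >= CUTOFF]

# Train a model on train_docs, then probe whether it can reproduce the ideas
# in held_out without ever having seen them.
print(f"{len(train_docs)} training docs, {len(held_out)} held-out docs")
```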
It would be hard to know for sure if a discovery was genuine or accidentally included in the training data though.
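One common (and imperfect) check for that is word-level n-gram overlap between the held-out material and the training corpus, roughly in the spirit of the decontamination reported for GPT-3. A rough sketch, with toy strings:

```python
def ngrams(text, n=8):
    """Set of word-level n-grams; 8-13 grams are typical for decontamination."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def looks_contaminated(eval_text, training_texts, n=8):
    """Flag an eval item if any of its n-grams appears verbatim in training."""
    eval_grams = ngrams(eval_text, n)
    return any(eval_grams & ngrams(doc, n) for doc in training_texts)

# Toy usage: the first training doc quotes the eval item verbatim.
train = ["the key result is that e equals m c squared in all inertial frames",
         "unrelated discussion of thermodynamics and entropy"]
evaluation = "that e equals m c squared in all inertial frames was shown"
print(looks_contaminated(evaluation, train))  # True

# Caveat: verbatim overlap misses paraphrases, so this only catches the
# most direct form of leakage.
```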
Any thoughts on how one could get started with this?
If I wanted an unbiased account of what happened before a war started, I would want all the information about the two countries at different points before the war. Because after a military war starts, an INFORMATION war also starts. Propaganda will be spread by both sides, as wars are just as much about global support as they are about military dominance.
>Overspecialization of models, often referred to as overfitting in machine learning, is a condition where a model learns the details and noise in the training data so well that it negatively impacts its performance on new, unseen data. This prevents the model from being able to generalize its knowledge effectively.
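To make the quoted definition concrete, a minimal numpy sketch (toy data, illustrative polynomial degrees):

```python
import numpy as np

# Minimal illustration of overfitting: a high-degree polynomial memorizes the
# noise in a small training set and generalizes worse than a simple line.
rng = np.random.default_rng(0)

def noisy_line(n):
    x = rng.uniform(-1, 1, n)
    y = 2 * x + rng.normal(0, 0.3, n)  # true relationship is linear
    return x, y

x_train, y_train = noisy_line(20)
x_test, y_test = noisy_line(200)

for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
# The degree-9 fit drives training error down while test error goes up: it
# has learned the noise, not the signal.
```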
It is crucial to have a good framework for how you ask your questions, though, to avoid bias when using these systems and to keep the focus on raw facts. To test ideas, I like to make the model argue both opposite extreme sides of a question; then I can make up my own mind.
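That "argue both sides" workflow is really just two symmetric prompts. A rough sketch, where `ask_model` is a hypothetical stand-in for whatever chat or completion call your stack exposes:

```python
# Rough sketch of the "make it fight both sides" workflow. `ask_model` is a
# hypothetical callable: prompt string in, response string out.

ADVOCATE_TEMPLATE = (
    "Argue as forcefully as you can FOR the following position, "
    "using only verifiable facts:\n\n{claim}"
)
CRITIC_TEMPLATE = (
    "Argue as forcefully as you can AGAINST the following position, "
    "using only verifiable facts:\n\n{claim}"
)

def debate(claim, ask_model):
    """Run both extreme sides of an argument and return them for comparison."""
    pro = ask_model(ADVOCATE_TEMPLATE.format(claim=claim))
    con = ask_model(CRITIC_TEMPLATE.format(claim=claim))
    return pro, con

# Usage: pro, con = debate("X caused Y", ask_model=my_local_llm)
# Read both, then make up your own mind.
```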
I feel like that generation of models was around the point where we were getting pleasantly surprised by their behavior. (I think people were having fun translating things into sonnets back then?)
I imagine (the author hints at this) that to do this rigorously, spelling out assumptions and so on, you'd have to build on the theoretical frameworks used to inductively synthesize and qualify interviews and texts, which already exist in history and the social sciences.