, forcing explicit step-by-step chains to make prediction more reliable?
Is it preferred to always prompt the LLM to use a specific structure to be more accurate?
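One way to see what "a specific structure" buys you is to compare prompts directly. Below is a minimal sketch: the `call_llm` helper is a hypothetical placeholder for whatever model API you actually use, and the requested format is just one possible structure that forces explicit, checkable steps.

```python
# Sketch: the same math question asked two ways. `call_llm` is a hypothetical
# stand-in for a real model call (OpenAI, Gemini, a local model, ...).

def call_llm(prompt: str) -> str:
    """Placeholder; wire this up to your LLM provider of choice."""
    raise NotImplementedError

question = "A train travels 120 km in 1.5 hours. What is its average speed?"

# Free-form prompt: the model may jump straight to an answer.
freeform_prompt = question

# Structured prompt: forces explicit intermediate steps you can inspect.
structured_prompt = (
    "Solve the problem below. Respond in exactly this format:\n"
    "Step 1: <restate the given quantities>\n"
    "Step 2: <show the calculation>\n"
    "Final answer: <number with units>\n\n"
    f"Problem: {question}"
)

# answer_a = call_llm(freeform_prompt)
# answer_b = call_llm(structured_prompt)
```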
confidently wrong in a way
Yes. Also, when it generates inefficient answers to math problems, it's still confident, and that often leads me to doubt my own intuition.
The LLM predicts continuations that match those high-quality human patterns.
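To make "predicts continuations" concrete, here is a toy sketch of the autoregressive loop. The hard-coded probability table is an illustrative stand-in for a trained model; real LLMs learn such distributions over an enormous vocabulary, but the mechanism of repeatedly picking a likely next token is the same.

```python
# Toy next-token table standing in for a trained model.
next_token_probs = {
    "the": {"proof": 0.6, "answer": 0.4},
    "proof": {"is": 0.9, "fails": 0.1},
    "is": {"correct": 0.7, "wrong": 0.3},
}

def continue_greedily(start: str, steps: int = 3) -> list[str]:
    tokens = [start]
    for _ in range(steps):
        options = next_token_probs.get(tokens[-1])
        if not options:
            break
        # Pick the most probable next token, i.e. the continuation most like
        # the patterns "seen" in training.
        tokens.append(max(options, key=options.get))
    return tokens

print(continue_greedily("the"))  # ['the', 'proof', 'is', 'correct']
```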
But spotting errors in proofs? How can just predicting patterns do that?
predicting sequences that humans label as correct solutions.
Does that mean that if there's a very new question, the model would fail to solve it? But the truth is, LLMs like Gemini Deep Think are still surpassing PhDs in solving them.
advanced models produce remarkably sophisticated outputs. How might that emerge purely from prediction?
The language seems sophisticated to humans because the models have been trained, post-trained, and tuned toward outputs that read well and are pleasing to humans. But I'm still unsure how, in math or debugging, they generate correct, useful outputs.
—how might that guide your own use?
But I am confused. I have used LLMs to extract insights and spot inaccuracies (in math), generate novel brainstorming questions, write reviews, even generate prompts (prompt engineering). It doesn't seem like they were just predicting the next word.
general token predictors
How do they solve math then? Especially the very advanced models now.
Why couldn't we just build perfect detectors?
LLMs are inaccurate. Even if the individual parts of a response are mostly accurate, the response can still be wrong as a whole.
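A toy calculation shows why locally accurate pieces can still add up to a wrong whole: assuming, as a strong simplification, that each step is independently right with some fixed probability, the chance that an entire chain is right drops quickly with its length.

```python
# Illustrative only: assumes independent per-step errors at an assumed rate.
p_step = 0.98   # assumed per-step accuracy
for n_steps in (10, 25, 50):
    p_whole = p_step ** n_steps
    print(f"{n_steps} steps at {p_step:.0%} each -> whole chain right {p_whole:.0%} of the time")
```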
What happens if you prompt an LLM with "What do you think about [topic]?" versus simulating a discussion among diverse experts?
When we assign a generic persona, "you", to an LLM, it just picks one of the thousands of personas it can simulate, more or less at random, and answers from there. We never know how relevant or accurate that persona will be for advising on our question.
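The contrast is easiest to see as two prompt templates for the same topic. The sketch below is hedged the same way as before: `call_llm` is a hypothetical placeholder, and the topic and expert roles are invented for illustration. Only the prompts differ; the panel version names the perspectives the model has to argue from instead of leaving the persona to chance.

```python
# Two prompt templates for the same topic. `call_llm` is a hypothetical
# placeholder for whatever model API you actually use.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

topic = "using LLMs to check mathematical proofs"

# Generic persona: whatever default "you" the model happens to adopt.
generic_prompt = f"What do you think about {topic}?"

# Simulated panel: named perspectives the model must argue from explicitly.
panel_prompt = (
    f"Simulate a short discussion about {topic} between three experts:\n"
    "1. A research mathematician focused on rigor.\n"
    "2. A software engineer focused on tooling and failure modes.\n"
    "3. A skeptical reviewer looking for overclaims.\n"
    "Each expert speaks twice; then summarize the points of agreement."
)

# opinion = call_llm(generic_prompt)
# panel = call_llm(panel_prompt)
```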