12 Matching Annotations
  1. Apr 2023
    1. Training will continue to be expensive and require supercomputing-level resources — it costs many millions to train these models, because they consume huge amounts of data, and the training is very computationally intense and must be repeated many times to tune the weights of the model. There will be efficiency improvements, but training costs of the most powerful models will rise as training datasets expand and model architectures grow.

      This is true for the state-of-the-art, most general models. But I strongly believe there will be models that are nearly as good and don't require so much expense. I bet that Bloomberg didn't spend that much money: https://arxiv.org/abs/2303.17564. The LLaMA work from Facebook includes smaller models that they say are competitive with GPT-3. See Ben's thoughts also: https://benjaminfspector.com/ae.html (I think his title is not exactly what his article says, btw).

    2. Tools for ML workers

      I don't know how to make a comprehensive list of all the places this applies, but I think the applicability is broader than this.

      I think customer service is a very likely place where this will work well: call center automation. We'll train on more specific data, do lots of one-shot training, and perhaps reinforcement learning from all the users. Perhaps we'll have a prominent big red button that users can push to immediately report an issue and get a human right away. Perhaps also an audit trail that is reviewed very efficiently by another system looking for issues, roughly as sketched below.
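
      To make the escalation-plus-audit-trail idea concrete, here is a minimal sketch of how such a flow might be wired up; the class and function names are hypothetical, not from any real product, and the model call is stubbed out.

      ```python
      from dataclasses import dataclass, field
      from typing import Callable, List

      @dataclass
      class AuditEntry:
          user_message: str
          bot_reply: str
          escalated: bool

      @dataclass
      class CallCenterBot:
          # Wraps whatever LLM actually generates replies.
          generate_reply: Callable[[str], str]
          # Every exchange is recorded so a second system can review it for issues.
          audit_log: List[AuditEntry] = field(default_factory=list)

          def handle(self, user_message: str, red_button_pressed: bool = False) -> str:
              if red_button_pressed:
                  # The "big red button": skip the model and route to a human.
                  reply = "Connecting you to a human agent now."
                  self.audit_log.append(AuditEntry(user_message, reply, escalated=True))
                  return reply
              reply = self.generate_reply(user_message)
              self.audit_log.append(AuditEntry(user_message, reply, escalated=False))
              return reply

      # Trivial stand-in for the model, just to show the flow.
      bot = CallCenterBot(generate_reply=lambda msg: f"(model answer to: {msg})")
      print(bot.handle("Where is my order?"))
      print(bot.handle("This is not working!", red_button_pressed=True))
      ```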

      I bet there are countless other examples of use. I haven't delved into the BloombergGPT stuff, but it must be useful.

      I see. Your list of "some examples" above is indented under software development, and I don't think it should be.

      Let me know when you've finished this. If you sort-of publish it, I'll use it somewhere!

      Thanks for writing this. It's close. Of course, you'll need to keep updating it once a month, given the rate of change.

    3. This particular jailbreak no longer works, but it demonstrates what researchers are doing with prompt engineering, and perhaps what future black-hat researchers might do to attack models

      Do you have any of the output from this jailbroken prompt that you could show?

    4. Tools for "experts"

      You might want to emphasize that any tool where failures are tolerable is fair game for these things. (Of course, you know my framework.) So providing high-quality, but imperfect, input to experts is a likely place. Experts, in theory, can check the quality and fix the modest number of hallucinations, etc.

      I think this framing is slightly more general than yours... but you have it essentially right.

    5. There are a growing number of examples of third parties writing optimized versions of a model someone else developed2

      Right -- this may mean you soften the 2nd paragraph of this section.

    6. Garbage in, garbage out: training data that’s incorrect, biased, etc., will produce a model that makes mistakes, has biases, etc.

      I might add a 5th point: while there is every likelihood that these LLMs will greatly improve and be very useful, there is a significant difference of expert opinion as to how much current approaches to foundation models will resolve these problems.

      I might also add a 2a point: they do not at present identify the data from which their statements are derived, nor do they typically provide references.

    7. Apparently a temperature of .8 seems to produce the best essay results

      Any intuition you can provide on what the temperature does? E.g., temp = 1 perhaps means you only select the most likely word, and temp = .8 means ...
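
      For intuition, here is a minimal sketch of the standard temperature-scaled sampling step (the function name and toy logits below are mine): the model's logits are divided by the temperature before the softmax, so temperatures near 0 approach greedy decoding (always picking the most likely word), a temperature of 1 samples from the model's unmodified distribution, and values above 1 flatten the distribution and make the output more surprising. A setting like .8 sharpens the distribution modestly, which is presumably why it reads well for essays.

      ```python
      import numpy as np

      def sample_next_token(logits, temperature=0.8):
          """Sample a token id from raw model logits using temperature scaling."""
          scaled = logits / temperature                   # temp < 1 sharpens, temp > 1 flattens
          scaled -= scaled.max()                          # subtract the max for numerical stability
          probs = np.exp(scaled) / np.exp(scaled).sum()   # softmax over the scaled logits
          return np.random.choice(len(probs), p=probs)

      # Toy example: three candidate tokens, with token 0 favored by the model.
      logits = np.array([2.0, 1.0, 0.1])
      print(sample_next_token(logits, temperature=0.8))
      ```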