6 Matching Annotations
- Aug 2022
-
deephaven.io
-
Preview: quick before and after
Check out the Preview section to see how much better the blog post images are when generated by DALL·E 2 for $45
-
- Mar 2020
-
victorzhou.com
-
Here’s a very simple example of how a VQA system might answer the question “what color is the triangle?”
- Look for shapes and colours using CNN.
- Understand the question type with NLP.
- Determine strength for each possible answer.
- Convert each answer strength to a % probability.
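The last step above, turning raw answer strengths into percentage probabilities, is typically done with a softmax. A minimal sketch, assuming three hypothetical candidate answers and made-up strength values:

```python
import numpy as np

def softmax(strengths):
    # Subtract the max before exponentiating for numerical stability
    z = np.exp(strengths - np.max(strengths))
    return z / z.sum()

# Hypothetical raw "answer strengths" for the question
# "what color is the triangle?" over three candidate answers
answers = ["red", "green", "blue"]
strengths = np.array([0.5, 2.3, 0.1])

probs = softmax(strengths)
for a, p in zip(answers, probs):
    print(f"{a}: {p:.1%}")
```

The outputs are non-negative and sum to 1, so they can be read directly as answer probabilities.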
-
Visual Question Answering (VQA): answering open-ended questions about images. VQA is interesting because it requires combining visual and language understanding.
Visual Question Answering (VQA) = visual + language understanding
-
Most VQA models would use some kind of Recurrent Neural Network (RNN) to process the question input
- Most VQA models use an RNN to process the question input.
- For easier VQA datasets, a bag of words (BOW) is enough to turn the question into a fixed-length vector input for a standard (feedforward) NN.
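The BOW idea can be sketched in a few lines. The vocabulary and helper below are illustrative, not taken from the post:

```python
# Toy bag-of-words (BOW) encoding of a question, assuming a small
# fixed vocabulary chosen for illustration.
vocab = ["what", "color", "is", "the", "triangle", "rectangle"]
word_to_index = {w: i for i, w in enumerate(vocab)}

def bow_vector(question):
    # Count how often each vocabulary word occurs in the question
    vec = [0] * len(vocab)
    for word in question.lower().split():
        if word in word_to_index:
            vec[word_to_index[word]] += 1
    return vec

print(bow_vector("what color is the triangle"))
# → [1, 1, 1, 1, 1, 0]
```

Every question maps to a vector of the same length, which is exactly what a standard feedforward network expects as input.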
-
The standard approach to performing VQA looks something like this: Process the image. Process the question. Combine features from steps 1/2. Assign probabilities to each possible answer.
Approach to handling VQA problems: process the image, process the question, combine the two feature sets, assign probabilities to each possible answer.
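The four-step pipeline can be sketched end to end with stand-in features. The feature vectors, the elementwise-product merge, and the weight matrix below are all illustrative assumptions, not the post's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for steps 1 and 2: pretend CNN image features and
# question features (random here, purely for illustration).
image_features = rng.random(8)
question_features = rng.random(8)

# Step 3: one simple way to combine the two — elementwise multiplication.
combined = image_features * question_features

# Step 4: a linear layer plus softmax over 3 hypothetical answers.
W = rng.random((3, 8))  # made-up weights
logits = W @ combined
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)
```

A trained model would learn `W` (and the feature extractors); the sketch only shows how the pieces fit together.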
-
- Feb 2020
-
github.com
-
Of the three primary color channels, red, green and blue, green contributes the most to luminosity.
Green contributes more to luminosity than red or blue (RGB).
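One common set of luma weights that makes this concrete comes from the ITU-R BT.601 standard; a minimal sketch:

```python
# ITU-R BT.601 luma weights: green dominates perceived brightness.
R_WEIGHT, G_WEIGHT, B_WEIGHT = 0.299, 0.587, 0.114

def luminosity(r, g, b):
    # Weighted sum of the three channels gives perceived brightness
    return R_WEIGHT * r + G_WEIGHT * g + B_WEIGHT * b

# Pure green reads far brighter than pure red or pure blue.
print(luminosity(255, 0, 0))  # → 76.245
print(luminosity(0, 255, 0))  # → 149.685
print(luminosity(0, 0, 255))  # → 29.07
```

The weights sum to 1, so white (255, 255, 255) maps to 255, and green's weight (0.587) is nearly double red's and five times blue's.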
-