while the validation loss soars.
good
however, these fluctuations are more severe for the validation data, reflecting what was shown in the last five steps
very good
Additionally, adding more convolutional layers had the opposite effect, increasing both the accuracy of the model and the training time
good
As you can see, the convolutions are trying to pull out the different features of the nine, notably the tail and the circle
good
The tiny and small models seemed to do fairly well, with the tiny model producing the smallest discrepancy
I agree. Do you think the small model outperforms the medium sized one?
the model is very overfit and, therefore, won’t handle potential new data well
very good
so most of the important features remain, but unimportant information is removed
very good
the images are filtered so a distinct feature is highlighted
very good
overfitting the model
good
helps speed
good
Mount Rushmore, but with my faces
Nicely done! I'm certain your submittal will gain the attention of the review committee!
CNN performed better.
good
I “see” my “dreams” coming true
Remarkable submittal! I expect you will receive some votes from the review committee. Good job!
why it was important to include a callback in the model
good
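For reference, a minimal sketch of such a callback, assuming the 99% training-accuracy threshold used elsewhere in these responses:

```python
import tensorflow as tf

# A hypothetical callback that halts training once training accuracy passes 99%;
# it would be passed to model.fit(..., callbacks=[myCallback()]).
class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        if logs.get('accuracy', 0) > 0.99:
            print('\nReached 99% accuracy so cancelling training!')
            self.model.stop_training = True
```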
Secretariat With Rider
nice tests
More specifically, the first row depicts the 7 stored at index 0, the second row depicts the 7 stored at index 17, and the third row depicts the 7 stored at index 26 of the testing images.
excellent investigative design
the convolutional neural network (CNN) predictions had a slightly higher accuracy value of approximately 0.99
good
Convolving filters are very useful for computer vision because they process and extract the important features from the image
very good
Interestingly, this filter also produced wavy lines across the sky
that is really interesting, as that underlying structure was not immediately apparent to me upon first impression
As such, the outer edges of the image are not processed because there must be 8 directly adjacent pixels to the current pixel.
excellent interpretation of the process
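For anyone curious, here is a rough numpy sketch of that border behavior; the 3x3 kernel is just an illustrative edge-detection filter, and scipy.misc.ascent() is used only as a convenient demo image (newer scipy versions move it to scipy.datasets.ascent()):

```python
import numpy as np
from scipy import misc

image = misc.ascent().astype(float)   # greyscale demo image that ships with scipy
kernel = np.array([[-1, -2, -1],      # illustrative edge-detection filter
                   [ 0,  0,  0],
                   [ 1,  2,  1]])

output = np.zeros_like(image)
rows, cols = image.shape
# Start at 1 and stop one pixel short of each edge so every visited pixel has
# all 8 neighbors; the outer border is skipped, exactly as described above.
for r in range(1, rows - 1):
    for c in range(1, cols - 1):
        patch = image[r - 1:r + 2, c - 1:c + 2]
        output[r, c] = np.clip((patch * kernel).sum(), 0, 255)
```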
Art. Enough said.
Nicely done! I'm sure the review committee will be very interested to review your submittal!
the large and medium models became highly overfit
good
falls then rises and falls again.
good
I like how the hand came out through it. Given time, I’d probably filter the contrast of the image so that Earth would be broken up into pieces more like the arm.
Nicely done! I'm sure the review committee will be very interested in your submittal.
but its evaluation binary cross-entropy skyrocketing
good
More epochs won’t significantly improve performance. They might in fact reduce its performance on a test set by overfitting it.
good
I can definitively say while faster, the Deep Neural Network was not as effective at predicting on this dataset. The Conv2D and MaxPooling2D layers improved the model greatly.
good
This method speeds up model predictions, as there is less data to work on, without losing too much information
good
Each set of four dots is searched for the brightest dot, and that dot’s value is used as the new value
good
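For reference, a minimal numpy sketch of that "keep the brightest of each 2x2 block" idea:

```python
import numpy as np

def max_pool_2x2(image):
    """Halve each dimension by keeping the brightest pixel of every 2x2 block."""
    rows, cols = image.shape
    trimmed = image[:rows - rows % 2, :cols - cols % 2]   # drop an odd last row/column
    return trimmed.reshape(rows // 2, 2, cols // 2, 2).max(axis=(1, 3))

demo = np.array([[1, 3, 2, 0],
                 [5, 2, 1, 4],
                 [0, 1, 7, 2],
                 [3, 2, 0, 6]])
print(max_pool_2x2(demo))   # [[5 4]
                            #  [3 7]]
```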
never able to comprehend but captivated by its beauty nonetheless.
How interesting! I'm certain the review committee will be intrigued by your submittal. I find it captivating myself!
In this case, comparing the 4 models demonstrates the concept: a larger model has more power to generalize more complex data, but overfits easily if the model is overqualified for the data set.
excellent
Adding more convolutions would increase the time it took to train the data even more with minimal returns on the increase to accuracy since the accuracy is already at 99.4%
good
scipy library
yes!
After Kandinsky Filter
This is sublime. I'm sure you will receive some votes from the review committee!
Below is my image for the Jump Start Data Science T-shirt competition, and also my assignment for Project 2!
Nice artwork! I'm certain the review committee will be very interested to consider your submittal.
The diagonal axis is just the distribution of that variable
What is the relevance of the area under the curve?
The ImageDataGenerator() command is essentially reading the pictures from our source folder.
good
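For context, a minimal sketch of how that reading-from-a-folder step usually looks; the folder path, image size, and batch size here are placeholders rather than the assignment's actual values:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and stream images straight from a folder;
# the sub-folder names become the class labels.
train_datagen = ImageDataGenerator(rescale=1.0 / 255)
train_generator = train_datagen.flow_from_directory(
    'path/to/training_images',   # placeholder source folder
    target_size=(150, 150),      # every image is resized on the fly
    batch_size=32,
    class_mode='binary')
```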
I think it’s safe to say that the convolutions improved the performance of my model.
very good
Shirt Images
Superb! I expect you will receive some votes from the review committee. Job well done!
Chemical Orbitals Styled London
this one is really interesting
I am sorry that you know do
Goya lived through some times of trauma in Spain!
Reversing the Content and Style Images
spooky!
As the number of layers increases in our models, there is a quicker and quicker drop off in the loss
good
With convolutional and pooling layers, it appears that more and more of the original image is reduced to these basic outlines of features. That is, there appears to be more information lost, but less noise.
excellent!
no information appears to have been lost, even though the picture size has been halved from the original 1022x767 to 511x383
very good
misc.face
you found the other .misc image in scipy
This stylized image is supposed to be what our vision looks like after sitting at the computer for hours after our bedtime due to various issues with our coding
Very interesting! A starry inside night -- great work, I'm sure the review committee will be excited to review your submittal!
it seems like the model has been overfit at this point, as the values appear to be stagnant or even increasing
good
This provides a useful tool for investigating the co-relationship amongst the variables, as we can see how they interact independently from all of the other variables
good
the letters to take the color scheme of the microchip that happened to also be W&M colors
What a novel idea! A great look for the review committee to consider during its selection process.
It was revealed that the tiny model avoids being overfit and the larger models are indeed overfit.
very good
So, the model would work better if we decreased the epochs
good
because it is smaller, so it takes up less space, and the image itself is maintained so you don’t lose any important features
very good
thinking about what is going on in the background: the pixels are being manipulated in a way that uses nearby pixels to guess and change values, so the result comes out different
nice interpretation
This summer was very busy and I thought that choosing this image would remind me to relax and take a step back to experience nature and all of the beautiful things that surround me
Arnold is a very serene robot! I'm sure the review committee will be excited to consider your entry. Good job!
I noticed that the tiny one performed the best
good
It seems that by adding the Conv2D and MaxPooling2D layers, the accuracy of the neural network improved.
good
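For readers following along, a minimal sketch of that kind of Conv2D/MaxPooling2D stack for 28x28 greyscale images; the layer sizes are illustrative rather than the exact ones used above:

```python
import tensorflow as tf

# A small convolutional classifier for 28x28 greyscale images (e.g. fashion MNIST)
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```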
Times of Passion
Wow! I expect your work will receive some votes from the review committee. Great job
allows us to recognize whether we are overfitting or underfitting
good
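One way to see that at a glance is to plot the training and validation curves together; a sketch, assuming a compiled `model` and the usual x_train/y_train and x_test/y_test arrays are already in scope:

```python
import matplotlib.pyplot as plt

# `model`, x_train, y_train, x_test, y_test are assumed to be defined already
# (e.g. a compiled Keras classifier and its dataset).
history = model.fit(x_train, y_train, epochs=20,
                    validation_data=(x_test, y_test))

# A widening gap between the two curves is the usual sign of overfitting.
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()
```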
but has maintained all the features of the original image
good
we could classify an image based on its features
good
My Design
Wow! I expect that you will collect some votes from the committee for your excellent design.
it’s almost like a highlighter
nice analogy!
The images the model trained and tested on are imperfect in comparison to the fashion mnist and numbers mnist datasets, so the model is fairly inaccurate. We can also tell that the model is overfit because it reached a training score of 100% while the validation score remained at a low ~75%.
good
additional convolutions may begin to overfit the model
good
My favorite image is the following
Wow! I have a feeling your art work will receive some positive votes from the committee!
comparison graphs can confirm that a smaller model will not underfit/overfit in this case
very good
skewed right
good
Thus, the model must be either under or overfitted
very good - perhaps the source data itself is also at issue
categorical labels are necessary
How about discrete values that have ranks?
I do have questions concerning this method; rescaling an image that potentially does not have the same aspect ratio could distort the image and thus the accuracy
Perhaps you could suggest a model based approach that would be more effective?
overfitting
very good
Thus, although it was better at 64 filters, too many filters can overfit the data
excellent
Although the filter applied was meant to detect edges, our values may have overblown the exposure, thus wiping out a lot of the edges. This filter would probably not be very effective for edge detection.
I'm wondering if you also applied a weight?
we notice that convolutions are performing linear transformations on the image
excellent
In other words, we are filtering specific characteristics in order to look at the data in the image that is necessary
excellent description
since the model represents the training data better
nice plots, good work!
overfit
good
how
Good - guess/probability
The 4 bedroom house is the best deal and the 3 bedroom house is the worst deal. In my model the predicted value was subtracted from the actual price; the lowest (most negative) number tells me that house had the best deal, which was the 4 bedroom house, and vice versa for the 3 bedroom house.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
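As a concrete illustration of the kind of table being asked for here, a short pandas sketch using the listing prices quoted in these excerpts and the per-bedroom predictions reported by one of the models further below (all in thousands of dollars; each student's model will give different numbers):

```python
import pandas as pd

# Listing prices quoted in the excerpts and one model's predicted price per
# bedroom count, both in $1,000s.
deals = pd.DataFrame({
    'house':     ['Hudgins (3 bd)', 'Moon (2 bd)', 'Church (4 bd)', 'New Point Comfort (5 bd)'],
    'actual':    [97.0, 250.0, 399.0, 577.2],
    'predicted': [234.5, 169.3, 299.8, 365.1],
})
deals['difference'] = deals['actual'] - deals['predicted']   # negative = underpriced
print(deals.sort_values('difference'))                       # best deal first, worst last
```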
Traditional Programming: you have rules and data come in and answers that come out; the rules are manually hardcoded. Machine Learning: you have answers and data (i.e., labels) come in and rules that match one to the other come out; an algorithm is used, so it is not manual.
OK. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
where the training accuracy still continues to increase.
Any ideas as to why it continues to do so? Good work!
Produce the following plot for your randomly selected image from the test dataset
super! nice plot
As Maroney explains, “if you show them all the shoes, then there’s no point. You’d have to show them some shoes, then let them train by identifying and picking out new things that they’ve never seen before.”
good
On the other hand, the worst deal is Moon (2 bd 250k) because the predicted price is two-thirds of the actual price, so the actual price would be “way too high” and thus “not a good deal”!
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
fit price
very good
No, because the model takes steps towards the right answer in different sizes. The epochs (number of loops/iterations) is also limited to 500 so it will not always be the same as it doesn’t always run until 100% completion, but will get pretty close.
OK, but do you think probability has anything to do with the two, almost the same, yet still different results?
Simplified, it’s that traditional programming is rules + data –> answers, vs machine learning is answers + data –> rules.
Very good. Could you also elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
increased
Would you describe these results as good? Are you suspicious of any potential problems inherent to the model itself?
(60000, 28, 28) 60000 (10000, 28, 28)
Good, nice plot
The Hudgins house was the best deal based on bedrooms because my program says the 3 bedroom house should be worth 245k. The worst house is the Moon house because its price is 250k while my model said it should be 123k.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
guess
good - probability
Machine Learning is taking answers and data and trying to find rules, while traditional programming is trying to take rules and data to get answers
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
I think it is safe to say from this exercise that the amount of rooms in a house cannot be the only determining factor for house price.
Very good - an ordered table would also help to support your argument
probabilities
exactly
The machine (watch) uses data and pre-supplied information (answers) in order to determine when the user is performing a certain task or activity (rules). In a traditional sense, the machine should have been given the rules in order to determine the answers but in a situation like this one, that order to operation does not make sense given how user specific the data is.
Nice thoughtful and comprehensive response.
By training the model on more people, it is possible for the model to identify a wider range of individuals and more accurately distinguish social distancing violations.
Good work!
However, if the camera angle is closer to 90 degrees, the results can be far more accurate, making the detector a good choice
Good observation.
When the line peaks, this means that the accuracy is no longer increasing, and the model is beginning to become overfit
Excellent! Nice plots!
while the rest of the data is used to see how well the model can predict a value
good
It seems that the worst deal was the Church house (with 4 bedrooms costing $399,000).
Good! Adding a table would be helpful here
Because of this, it is possible for the values to be different depending on how much error the loss deems from the guess.
Yes, due to the guess itself (probability)
This is different from traditional programming, in that with machine learning, the computer/program will be able to define the rules as the output, while traditional programming requires the user to enter the rules as an input.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
This value could be an indication that we overpaid for our home.
good - using a table here could be helpful
For example, if we were to input customer demographics and transactions, and then have historical customer churning rates as an output. Using these two characteristics, the algorithm will then create the program and will give you predictions (in this case) based on the data you provided.
Good. I like the applicable example.
Based on my model, the house at Holly Point Rd. presents the best deal as you would be spending $134,365 less than what the model predicted. Meanwhile, the house in Church St. would present the worst deal as you would be paying $98,088 more than the model predicted.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
These are different answers because the model is recompiled and relearning the data set and there was no random seed set.
excellent! the perfect computer scientist answer
Meanwhile, machine learning involves inputting the answers into a machine and having it figure out the rules for the programmer.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
It’s an upward trend as people are getting more reliant on the use of technology.
Perhaps it also is related to the fact that the rate of human population growth is increasing, as well as the complexity associated with larger numbers of people.
What is the shape of the images training set (how many and the dimension of each)? 28x28 & 60,000
What is the length of the labels training set? 60,000
What is the shape of the images test set? 10,000
perfect!
By knowing how accurate the model is, we know how accurate it may perform with new data
very good
By making a neural net model, we can estimate what each house should be priced based on bedroom number. Looking at the output of the code, we see that 160 Holly Point Rd, a house with three bedrooms selling for $97,000, is the best deal. Based on the model, we see that 760 New Point Comfort Hwy, a five bedroom house selling for $577,200, is the worst deal.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
This model could have been strengthened by using square footage instead of bedroom count in the input. Thus, it would account for other spaces and bathroom count.
Yes very good, adding an additional predictive variable would likely improve the model. What about more observations?
Because it is a stochastic process, the answers will be very close but not often the same
Very good -- how often do you think it will in fact be the same?
probabilities
good
Traditional Programming: One inputs rules and data in order to derive answers. Machine Learning: One inputs data and answers in order to derive rules.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
In this graph, the training accuracy increases with each epoch. Validation accuracy peaks again, indicating overfitting.
nice graphs!
the model is training too well on the train data
good
overfitting
good
(wrote the code)
Could you have posted the image here?
The optimizer and loss function are used when compiling the model, and the ones used, “adam” and “sparse_categorical_crossentropy”, are useful when classifying multiple categories.
Could you have provided a bit more explanation of how these two functions serve to improve the prediction from the neural net model?
The Hudgins house presents a great deal, because my neural network estimated that the price of a 3 bedroom house should be around $233,000, but the house is only priced at $97,000. One of the worst deals is the church house, because the network determined that a 4 bedroom house should be priced at around $300,000, but the house is instead priced at $399,000.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
probability
good
In traditional programming, the programmer will input data along with rules. From the combination of the two, the model will predict the answers. However, machine learning is a reorientation, as the programmer inputs data and answers, and the model instead figures out the rules. For example, it can figure out the relationship and other rules between variables.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
overfitting
good
This wasn’t a question and, consequently, I’m not sure how to answer this, so here’s my code if that helps:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# x_train, y_train, x_test, y_test (the mnist digit arrays) and the myCallback
# class are assumed to be defined earlier in the notebook.
class_names = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']  # digit labels referenced by plot_image

callbacks = myCallback()

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])

# The final Dense layer already applies softmax, so this extra Softmax layer is
# redundant here, but it does not change which class is predicted.
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
predictions = probability_model.predict(x_test)
predictions[1000]
np.argmax(predictions[1000])
y_test[1000]

def plot_image(i, predictions_array, true_label, img):
    predictions_array, true_label, img = predictions_array, true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)
    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'
    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100 * np.max(predictions_array),
                                         class_names[true_label]),
               color=color)

def plot_value_array(i, predictions_array, true_label):
    predictions_array, true_label = predictions_array, true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

i = 1000
plt.figure(figsize=(6, 3))
plt.subplot(1, 2, 1)
plot_image(i, predictions[i], y_test, x_test)
plt.subplot(1, 2, 2)
plot_value_array(i, predictions[i], y_test)
plt.show()

num_rows = 5
num_cols = 3
num_images = num_rows * num_cols
plt.figure(figsize=(2 * 2 * num_cols, 2 * num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 1)
    plot_image(i, predictions[i], y_test, x_test)
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 2)
    plot_value_array(i, predictions[i], y_test)
plt.tight_layout()
plt.show()
These probabilities add to 1
excellent
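A quick numeric check of that fact, using made-up logits:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.1])             # made-up raw model outputs
probs = np.exp(logits) / np.exp(logits).sum()  # softmax
print(probs)        # [0.659 0.242 0.099] (approximately)
print(probs.sum())  # 1.0 (up to floating-point rounding)
```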
logits
good
According to the model (trained on the new house data), the Church St home is the most overvalued (and is therefore the worst deal) at 99k over model price (300k). The best buy would be Holly Point which is undervalued at 138k below model price (235k)
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
which is due to the stochastic or random nature of neural networks
very good
Machine learning, however, takes answers and data as an input and the model creates (or, perhaps more appropriately, guesses) the rules.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines
There is not a large impact. The loss is 0.2775 and takes 45s to train. The accuracy is better by about .025 and takes about 15s longer compared to a single Dense layer of 512
Good assessment of the computational expense associated with this added line of code.
There are 10 neurons to match the 10 expected outputs for the network. I get an InvalidArgumentError when I try training the network with 5 instead of 10 neurons in my last Dense layer
Good
I get a ShapeError. This is because our data is currently 28x28 pixel images, and we cannot have a 2D network. We need to flatten the 2D array into a 784-element 1D array for the model to work
OK, good. I see these are the questions from the notebook. Thank you for providing these answers!
I know this because the 10th element on the list is the biggest, and the ankle boot is labeled 9. It should be noted that the 10th element actually corresponds to the digit 9, which represents the ankle boot, since the neurons are numbered 0-9 for a length of 10
OK, I see - think the class structure from the fashion_MNIST dataset was still present when running the code on the mnist dataset (letters)
The home that costs $97,000 with 3 bd and 1 ba is the best deal since it has the greatest difference from the predicted price using the model that was fit to the 6 homes. The home that costs $577,200 with 5 bd and 2 ba is the worst deal since it has the greatest difference from the predicted price using the model. I fit a model on the given home prices and then compared each home’s bedroom to the original given model of 50 + 50x where x is the number of bedrooms. Then the model predicted the price for each bedroom. Then this predicted price was subtracted from the original given model with the same number of bedrooms. The result with the highest and lowest price are the worst and best houses respectively
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
where the computer guesses and how accurate its first guess is
Very good
For machine learning, instead of the programmer writing the rules, the machine will look at both the input and the output and find the rules that govern them
Could you have further elaborated on this explanation? What makes this new design possible for machines to "learn"?
I.D.
1) 60000 images, 28 by 28 pixels
2) 60000
3) 10000 images, 28 by 28 pixels
4) array output: [[9.6861429e-11 4.3787887e-07 9.9999952e-01 1.0604523e-09 1.2526731e-16 3.6759984e-10 2.3595672e-11 1.7706627e-14 8.6952684e-10 1.8218145e-17]]
I have attached a plot depicting the first test image alongside its probability distribution, below. Interestingly, the model was so confident in its prediction that the first test image depicted a 7, that the remaining probabilities don’t even appear on the graph.
How interesting! I am greatly appreciating your thoughtful and thoroughly comprehensive responses. Please keep up the fantastic work!
Doing so helps keep future positive neuron outputs from being cancelled out by any previous negative neuron outputs
Excellent - and therefore improving the model's potential predictive power
In other words, “relu” removes negative outputs
I like the use of "in other words"
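In code it really is that simple; a one-function sketch (not the actual TensorFlow implementation):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)   # negative values become 0, positive values pass through

print(relu(np.array([-3.0, -0.5, 0.0, 2.0, 7.0])))   # [0. 0. 0. 2. 7.]
```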
However, the model then can’t be tested for accuracy on the same training set because it already knows what those images are supposed to depict.
excellent
As a result, I determined that the Hudgins house has the best value, at a price of $97,000, because it costs approximately $137,567 less than houses with three bedrooms are predicted to cost. The Hudgins house is thus the most undervalued. On the other hand, the Church house has the worst value, at a price of $399,000, because it costs approximately $99,193 more than houses with four bedroom are predicted to cost. The Church house is thus the most overvalued.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
Consequently, the machine may not predict the exact answer of 30 when the input is 7, 100% of the time.
Exceptional
probability
perfect
100X
Excellent! You have already identified the relevance of scale!
In traditional programming, programmers use rules and data to produce answers. However, machine learning almost reverses this process, as the programmer must know what answer they’re looking to receive and provide the data necessary to reach this answer. Then, the machine/computer will generate the rules necessary to reach that answer. In short, traditional programming yields answers based on rules and data, while machine learning yields rules based on answers and data.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
After a while, the validation accuracy actually starts to decrease, as the model is getting overfit
Good!
Loss is the ‘inaccuracy’ of the prediction, so minimizing loss increases the accuracy of the model
excellent
D. Using the mnist drawings dataset (the dataset with the hand written numbers with corresponding labels) answer the following questions.
1. 60000, 28, 28
2. 60000
3. 10000, 28, 28
4.
```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

class Callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') > 0.99:
            print('\nReached 99% accuracy so cancelling training!')
            self.model.stop_training = True

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

callbacks = Callback()
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10, callbacks=[callbacks])

probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
predictions = probability_model.predict(x_test)
random_image = np.random.randint(0, len(x_test))
print('Random Image Number:', random_image)
print(predictions[random_image])
```
5.
```python
print(np.argmax(predictions[random_image]))
```
6.
```python
def plot_value_array(i, predictions_array, true_label):
    predictions_array, true_label = predictions_array, true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
plot_value_array(random_image, predictions[random_image], y_test)
_ = plt.xticks(range(10), numbers, rotation=45)
plt.show()
```
If we simply did this with training data, the model has already seen this data, so we could run into the issue of the model simply memorizing the classification of the training data and not actually being able to classify
Could you have identified the term for this phenomenon that is common to neural networks?
If we collected testing data but did not know its classification, we would have no way of telling how well the model actually works
good
The Hudgins house is a 3 bedroom house that costs about $100K, but the model predicts that it would cost $215K, more than double the price.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
The two answers are not the same, but they are very close. This is because the model trains on the data by guessing and then using the loss function’s output with the optimizer to produce smaller and smaller losses. The prediction converges to the same value (of 22 in this case), so the difference is very minimal, but the output is always slightly different.
Good
According to Maroney, the difference between traditional programming and machine learning is that traditional programming involves inputting rules and data for the computer to produce the answers as the output, but machine learning takes data and answers as the input and the computer tries to determine the rules based on the data and answers.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
For example, mean squared error
Good MSE (but this is an incomplete sentence)
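For completeness, mean squared error is just the average of the squared prediction errors; a tiny sketch with made-up numbers:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])   # made-up target values
y_pred = np.array([2.5, 5.0, 8.0])   # made-up model predictions
mse = np.mean((y_true - y_pred) ** 2)
print(mse)   # 0.4166...
```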
cause errors
Do you mean increasing the probability of producing a more error prone forecast when using the model to predict?
The data is split into two sets to reduce overfitting and to better verify its accuracy. The model is only trained off of the training data. Its accuracy on new data can be tested using the test data, as it has never seen the test data. The test data stands in for what we would be trying to predict, but with answers so you can verify it.
good
Moon is the worst deal, with my model predicting that you’d pay $109,710 more than you should. Hudgins was the best deal, with my model predicting that you’d pay $125,186 less than it should cost.
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
The answers are different because the neural network is retrained and fit on the data. If I had set a random seed the answers might have been the same
Yes! (I got a chuckle from the fact you identified how to reproduce the same answer twice using seed). You could have also described the probabilistic nature of the NN in serving to produce two almost the same results, yet still slightly different. Isn't this also a remarkable fact!
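For anyone who wants to try reproducing a run, a minimal sketch of pinning the seeds before building and training the model (exact reproducibility can still depend on hardware and TensorFlow version):

```python
import random
import numpy as np
import tensorflow as tf

# Fix the sources of randomness before the model is built and trained
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)
```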
In traditional programming you use rules you write and data to get answers. Machine learning takes data and answers to make a set of rules for predicting on future data.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines
We also see the model becoming overfit in the accuracy graph
good -- excellent work
but we use sparse_categorical_crossentropy here because it is suited to finding “wrongness” in computer vision cases
excellent!
Based off of the model Maroney’s exercise outlines, only the house on Holly Point Rd (file name hudgins) is a good deal (3 bedrooms is predicted to mean the house is around 200k, while this house is selling at near 100k). The worst deal is the house on Church St (file name church), which sells at 399k, about 150k above its modeled price.
Good. Could you have provided a simple chart of the continuum of prices in order to support your finding?
The episode introduces machine learning as answers and data generating rules/relationships instead of a programmer figuring out rules to compute answers. This means that activity that is hard to find a set of rules for as a programmer can be modeled by a network of “neurons” which can then generate predictions. He demonstrates this idea by setting up a single neuron machine learner which is used to generate the function of a line from a list of values derived from said function. Having only a single neuron presents a very simple situation where we see only one rule; presumably a large set of interconnected neurons would lead to a lot of rules being generated.
Excellent!
This graph displays the accuracy of the model in predicting the connotation of a movie review compared to the number of epochs the model cycled through. The positive trend is typical as the optimizer is adjusting the cost function with the returns from the loss function. There seems to be a plateau at 4 epochs, meaning after 4 epochs we are gaining diminishing returns in accuracy relative to run time and may be overfitting the model.
Excellent work. I don't know why, but they usually present the results from the accuracy function first then the loss function second. Maybe, as it relates to assessing overfitting. Just an observation.
The loss function gives penalties to the model predictions that are incorrect. These penalties are fed into the optimizer, which adjusts the weights for each neuron based upon the size of the penalty from the loss function.
Could you elaborate on how this is measured?
He splits the data into training and testing because if we only trained the model on one set of data, it would get overfit. This allows for the verification of the model’s accuracy in identifying images it has not seen before in training.
good
When subtracting the predicted cost from the posted cost, house 2 yields the largest negative value, meaning it has the largest difference between the estimated price and what you would actually pay. This is the best deal because the buying offer is the lowest compared to the predicted price. On the contrary, house 5, when the predicted price is subtracted from the buying price, yields the largest positive value. This means the asking price is over the amount that is predicted based upon other offers and therefore is the worst deal.
Good
The best deal is house 2 ($97,000 for 3 bedrooms) and the worst deal is house 5 ($250,000 for 2 bedrooms). This is explained when comparing the prices offered to the prices predicted by the 1 layer, 1 neuron neural network, posted below (in $1,000s):
2 beds - 169.26517
3 beds - 234.5404
4 beds - 299.81564
5 beds - 365.09088
OK, but could you add a table to more clearly communicate your results. Perhaps order from best "deal" to worst?
22.002247 & 21.999706 These answers are different because the algorithm answers in probabilities of the best answer.
Good
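For reference, a minimal sketch of the kind of single-neuron model that produces those two nearly identical answers, assuming the underlying rule was y = 3x + 1 (consistent with predictions of roughly 22 for an input of 7):

```python
import numpy as np
import tensorflow as tf

# One Dense neuron learning y = 3x + 1 from six example points
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)

# Each run starts from different random weights, so repeated runs land near 22
# but rarely on exactly the same value.
print(model.predict(np.array([[7.0]])))
```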
Traditional programming had the inputs of the data and rules which gave an answer. Machine learning takes the data and answers as an input and outputs the rules needed.
Good. Could you elaborate on why this is significant? What aspect of data science has made this new design not only possible but also viable in its useful application towards addressing a multitude of research questions across a spectrum of disciplines?
I’ve noticed that a large gap in the literature is that scientists are failing to recognize how a number of different covariates work in tandem with one another to spread the disease.
exceptional, we should focus more on variable interactions -- Poisson, Cox & Gibbs point process models might help us
To accurately distribute resources and further understand how rates are changing, various geospatial data methods investigating the impact natural disasters have on nutrition rates were analyzed and compared, including: DHS data, spatial video, geographically weighted regression (GWR), and ordinary least squares regression models (OLS).
Geographically weighted regression (GWR) and ordinary least squares regression models (OLS) that used DHS and spatial video data sources were investigated in order to assess the effectiveness of post natural disaster resources distribution in terms of nutrition.
In 2010, a 7.0 magnitude earthquake struck Haiti and devastated millions. As an immediate effect of this natural disaster, child malnutrition became a relevant developmental issue in Haiti. The disaster substantially impacted sectoral conditions such as drinking water, sanitation, energy/fuel, food supply, healthcare, and the clearing of debris. Hence, these disrupted conditions stunted Haiti’s development. When considering solutions, humanitarian aid narrowed in on improving these sectoral and household conditions to lessen the malnutrition rate.
I think this section could be distilled down to two sentences. Could you include a quantitative measure that conveys the tangible significance of the harms?
Nutrition is a quintessential sustainable development goal. Child malnutrition can result from long-term factors such as inadequate dietary intake and disease, as well as short-term shocks like natural disasters and political turmoil.
Try to synthesize these two sentences. I like your introductory statement, excellent hook. Could you likewise continue to capture attention by integrating the following statement?
more often than not their institutions are given less credit than they deserve
interesting, and I suspect it's probably true
Where Russia differs is in localized regions within the country that experience different kinds of hardship; thus a universal or average solution does not work.
good
Excellent work. Central focus clearly defined. Some thoughts.
(1) Are you considering some of the machine learning approaches presented on WorldPop as a basis for understanding the informal sector in Cameroon? Random forest and hierarchical bayesian models present promise in this area. Gravity-type models using CDR data could also be useful to describe behavior and movement.
(2) You have identified a number of quantitative models from what appears to be traditional economics or development econ. Have any of these recently been extended to incorporate data science methods (machine learning)?
Keep going!
analyzed and displayed geospatially as rasterized data
good
Colombia’s annual manufacturing survey, which provides information such as sales, wages, employment, capital, input prices and output for companies with at least ten employees. They also used the Registry of Violence, which provides direct information about internally displaced people, such as their original municipality, the date they left, new municipality, and socioeconomic status
good
household statistics to create counterfactual analysis and identify which attributes were effective in eliminating economic inequality
interesting - I am detecting an emerging theme
Gini index
good
OLS regression tests
ok
counterfactual samples to make predictions on what factors such as tax rates, government revenue, and job creation had on the profit of these businesses
great
distance to the nearest roads and the average road density of the cities the businesses reside in
agreed - real estate prices are often largely set by the number of average daily trips for the road a piece of property directly adjoins
geospatial analysis of the General Enterprise Census and National Survey on Employment and the Informal Sector to map out the areas of informal and formal businesses and performed cross section testing to identify what types of areas were most profitable and in what industry
interesting
formal and informal businesses, with one being taxed and regulated by the Cameroon government while the latter is in a gray area which is not taxed nor regulated by the government
good
the informal economy is not a cause, but rather a consequence of existing poor conditions in LMIC’s such as Cameroon
excellent - emergence
rural-urban migration, where those who work in the informal sector on the rural side move to the urban side either temporarily or permanently in search of a job that pays much more than their current one (Todaro)
wow - good
size of the informal economy is also an indirect indicator of the conditions of many regions
good
Cameroon’s large informal economy exploits the rampant economic inequality that currently exists in Cameroon, causing unfair competition with formal businesses and discouraging economic growth in rural regions; this is ultimately a result of the significant lack of resources in rural regions, such as access to high-quality infrastructure, job opportunities, and education.
good
Excellent work. Good job defining your thematic focus, urbanization throughout India. A few thoughts.
(1) You've done a great job of capturing the three major themes as related to urbanization. First you have begun to identify some of the sources that have emerged in order to describe urban populations and their demographic composition. Gridded populations and other methods such as those found on WorldPop will be inherently useful in your critical analysis of the literature. Have you considered such machine learning approaches as random forest and also hierarchical bayesian models? You also have begun to touch on urbanization as a complex system, and I wonder if you will include such ideas as fractals and power laws in your review. Finally, the later source also focuses on high resolution description of urban areas - buildings, and possibly their use classification. I will be very interested to know more about the methods you select.
(2) Have you identified a gap in the literature? Can you formulate your research question? What type of puzzle do you think your investigation into urbanization throughout India presents?
Just keep going!
Altogether, spider charts precisely display the characteristics of the twelve cities. Therefore, the analysis answers the author’s question on India’s urban growth types.
Very good, but a bit dated. I know Taubenbock from when I was at the TU Berlin; he did some work with one of my colleagues. They used LiDAR to capture the 3D signature of building envelopes in order to classify building type/land use at a building-by-building resolution. The HRSL is similar to this. Taubenbock used to be the head of a research group at the DLR in Germany (kind of like the German NASA). I wonder what type of work they are doing currently?
classifying urban types
good
object-oriented hierarchical approach works well on measuring changes in the urban extension
good
multitemporal remote sensing for analyzing rapid urban changes. Time-series of Landsat data
good
Taubenböck, H.
good source
urban dynamics
I'm wondering about these different models - are these agent-based models? Is there a method that uses a mathematical formula to forecast growth? The author's work seems interesting