18 Matching Annotations
  1. Dec 2021
  2. Nov 2021
    1. If you don't have that information, you can determine which frequencies are important by extracting features with Fast Fourier Transform. To check the assumptions, here is the tf.signal.rfft of the temperature over time. Note the obvious peaks at frequencies near 1/year and 1/day:

      Do an FFT with TensorFlow.

      import matplotlib.pyplot as plt
      import numpy as np
      import tensorflow as tf

      # df: the weather DataFrame with hourly samples (assumed already loaded).
      fft = tf.signal.rfft(df['T (degC)'])
      f_per_dataset = np.arange(0, len(fft))

      # Convert "cycles per dataset" into "cycles per year".
      n_samples_h = len(df['T (degC)'])
      hours_per_year = 24*365.2425
      years_per_dataset = n_samples_h/(hours_per_year)

      f_per_year = f_per_dataset/years_per_dataset
      plt.step(f_per_year, np.abs(fft))
      plt.xscale('log')
      plt.ylim(0, 400000)
      plt.xlim([0.1, max(plt.xlim())])
      plt.xticks([1, 365.2425], labels=['1/Year', '1/day'])
      _ = plt.xlabel('Frequency (log scale)')
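
To see why the peaks land near 1/year and 1/day, here is a minimal sketch on synthetic data. It is not the tutorial's code: it uses NumPy's `np.fft.rfft` in place of `tf.signal.rfft` so it runs without TensorFlow, and the signal, its amplitudes, and the 8-year length are made-up assumptions.

```python
import numpy as np

hours_per_year = 24 * 365.2425

# Hypothetical hourly series covering roughly 8 years (assumption, not real data).
n_samples_h = int(round(8 * hours_per_year))
t_hours = np.arange(n_samples_h)

# A yearly cycle plus a weaker daily cycle.
signal = (10.0 * np.sin(2 * np.pi * t_hours / hours_per_year)
          + 3.0 * np.sin(2 * np.pi * t_hours / 24))

# Real-input FFT: bin k corresponds to k cycles over the whole dataset.
fft = np.fft.rfft(signal)
f_per_dataset = np.arange(len(fft))

# Rescale the frequency axis from cycles per dataset to cycles per year.
years_per_dataset = n_samples_h / hours_per_year
f_per_year = f_per_dataset / years_per_dataset

# The two largest magnitudes should sit at the yearly and daily frequencies.
magnitude = np.abs(fft)
top2 = np.argsort(magnitude)[-2:]
peaks = sorted(f_per_year[top2])
print(peaks)
```

The two printed frequencies should be close to 1 cycle per year and about 365 cycles per year, which is the same reading as the two peaks in the plot above.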
      
    2. Now, peek at the distribution of the features. Some features do have long tails, but there are no obvious errors like the -9999 wind velocity value.

      Indeed, a peek: this plot includes the test data too.

      df_std = (df - train_mean) / train_std
      df_std = df_std.melt(var_name='Column', value_name='Normalized')
      plt.figure(figsize=(12, 6))
      ax = sns.violinplot(x='Column', y='Normalized', data=df_std)
      _ = ax.set_xticklabels(df.keys(), rotation=90)
      
    3. It is important to scale features before training a neural network. Normalization is a common way of doing this scaling: subtract the mean and divide by the standard deviation of each feature. The mean and standard deviation should only be computed using the training data so that the models have no access to the values in the validation and test sets. It's also arguable that the model shouldn't have access to future values in the training set when training, and that this normalization should be done using moving averages.

      Use a moving average to avoid data leakage.
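
A minimal sketch of the two options mentioned above, assuming pandas and a synthetic stand-in for `df`. The helper name `causal_normalize`, the 70% split, the 168-sample window, and the 24-sample warm-up are illustrative assumptions, not values from the tutorial.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the weather DataFrame (assumption, not real data).
rng = np.random.default_rng(0)
df = pd.DataFrame({"T (degC)": rng.normal(10.0, 5.0, size=1000)})

# Option 1: statistics from the training split only, applied to every split.
n = len(df)
train_df = df[: int(n * 0.7)]
train_mean = train_df.mean()
train_std = train_df.std()
df_global = (df - train_mean) / train_std


def causal_normalize(frame, window=168, min_periods=24):
    """Normalize each row using only rows strictly before it (hypothetical helper)."""
    rolling = frame.rolling(window=window, min_periods=min_periods)
    # shift(1) keeps the current value out of its own mean and std.
    past_mean = rolling.mean().shift(1)
    past_std = rolling.std().shift(1)
    return (frame - past_mean) / past_std


# Option 2: moving statistics, so no row ever sees a later value.
df_causal = causal_normalize(df)
```

The first `min_periods` rows of `df_causal` come out as NaN because there is not yet enough history; in practice they would be dropped or filled before training.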

    4. Similarly, the Date Time column is very useful, but not in this string form. Start by converting it to seconds:
      timestamp_s = date_time.map(pd.Timestamp.timestamp)
      

      and then create "Time of day" and "Time of year" signals:

      day = 24*60*60
      year = (365.2425)*day
      
      df['Day sin'] = np.sin(timestamp_s * (2 * np.pi / day))
      df['Day cos'] = np.cos(timestamp_s * (2 * np.pi / day))
      df['Year sin'] = np.sin(timestamp_s * (2 * np.pi / year))
      df['Year cos'] = np.cos(timestamp_s * (2 * np.pi / year))
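
One way to see what the sin/cos pair buys you: times that are close on the clock stay close in feature space, even across midnight. The sketch below is an illustration with made-up timestamps, and `encode_time_of_day` is a hypothetical helper name, not part of the tutorial.

```python
import numpy as np

day = 24 * 60 * 60


def encode_time_of_day(timestamp_s):
    """Map seconds to a point on the unit circle (hypothetical helper)."""
    angle = timestamp_s * (2 * np.pi / day)
    return np.array([np.sin(angle), np.cos(angle)])


# Made-up timestamps in seconds (assumption, chosen for illustration).
just_before_midnight = 23 * 3600 + 59 * 60   # 23:59 on day 0
just_after_midnight = day + 60               # 00:01 on day 1
noon = 12 * 3600                             # 12:00 on day 0

# Two minutes apart across midnight: the encoded points are nearly identical.
d_wrap = np.linalg.norm(
    encode_time_of_day(just_before_midnight) - encode_time_of_day(just_after_midnight))

# Almost twelve hours apart: the encoded points are nearly opposite.
d_noon = np.linalg.norm(
    encode_time_of_day(just_before_midnight) - encode_time_of_day(noon))

print(d_wrap, d_noon)
```

A raw "seconds since midnight" feature would put 23:59 and 00:01 at opposite ends of its range, which is the discontinuity the sin/cos signals avoid.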
      
  3. Jul 2021
    1. If you're serious about neural networks, I have one recommendation. Try to rebuild this network from memory.
  4. May 2021
    1. Right-click on the post in the feed to see linked articles and related posts.

  5. Oct 2020
  6. Jul 2020
  7. Nov 2019
  8. Jun 2019
    1. I ended up turning Documents & Desktop sync off. I got frustrated with it because my data was constantly being uploaded and downloaded, wasting my bandwidth. But recently I found a tool on GitHub called iCloud Control. It adds a menu button to Finder that lets you remove local items, download items, and publish a public link to share your files.
  9. Jun 2016
    1. Hypothes.is

      Still not certain whether a Python program has to be running on the website where the files live for annotation via the Chrome plugin to work.

  10. Feb 2016
    1. MATH1280

      Simply highlight any part and comment here, so your peers can reply.

    2. req.Header.Add("Content-Type", writer.FormDataContentType())

      If you're reading this, do not forget the Content-Type. It is not in the initial example, but it is important. I don't understand why the author mentions it here but doesn't use it in the initial source code.