6 Matching Annotations
  1. Nov 2021
    1. If you don't have that information, you can determine which frequencies are important by extracting features with the Fast Fourier Transform. To check the assumptions, here is the tf.signal.rfft of the temperature over time. Note the obvious peaks at frequencies near 1/year and 1/day:

      Compute an FFT with TensorFlow:

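      import matplotlib.pyplot as plt
      import numpy as np
      import tensorflow as tf

      # df is the hourly weather DataFrame loaded earlier in the tutorial.
      # Real-input FFT of the temperature series.
      fft = tf.signal.rfft(df['T (degC)'])
      f_per_dataset = np.arange(0, len(fft))  # bin index = cycles per dataset

      n_samples_h = len(df['T (degC)'])
      hours_per_year = 24*365.2425  # mean Gregorian year, in hours
      years_per_dataset = n_samples_h/(hours_per_year)

      # Rescale bin indices from cycles per dataset to cycles per year.
      f_per_year = f_per_dataset/years_per_dataset
      plt.step(f_per_year, np.abs(fft))
      plt.xscale('log')
      plt.ylim(0, 400000)
      plt.xlim([0.1, max(plt.xlim())])
      plt.xticks([1, 365.2425], labels=['1/Year', '1/day'])
      _ = plt.xlabel('Frequency (log scale)')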
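
The same idea can be checked without the weather dataset. The following is a minimal sketch, assuming a synthetic hourly series (a yearly and a daily sine wave plus a little noise) in place of `df['T (degC)']`, and using NumPy's `np.fft.rfft` rather than `tf.signal.rfft`. The amplitudes, the four-year length, and all variable names are illustrative assumptions, not values from the tutorial.

```python
import numpy as np

# Assumption: four years of hourly samples stand in for the real temperature column.
hours_per_year = 24 * 365.2425
n_samples_h = int(round(4 * hours_per_year))
t_hours = np.arange(n_samples_h)

# Synthetic "temperature": a yearly cycle, a daily cycle, and small seeded noise.
rng = np.random.default_rng(0)
signal = (
    10.0 * np.sin(2 * np.pi * t_hours / hours_per_year)
    + 5.0 * np.sin(2 * np.pi * t_hours / 24.0)
    + rng.normal(scale=1.0, size=n_samples_h)
)

# Real-input FFT; rfftfreq returns cycles per hour when the sample spacing is 1 hour.
spectrum = np.fft.rfft(signal)
f_per_year = np.fft.rfftfreq(n_samples_h, d=1.0) * hours_per_year
magnitude = np.abs(spectrum)

# Pick the two strongest bins, skipping the zero-frequency (mean) bin at index 0.
top_two = np.argsort(magnitude[1:])[-2:] + 1
peak_freqs = np.sort(f_per_year[top_two])

print("Dominant frequencies (cycles per year):", peak_freqs)
```

The two strongest bins should land close to 1 cycle per year and roughly 365 cycles per year, matching the 1/year and 1/day peaks the quoted passage describes. The frequencies only fall on bin centers approximately, so a small offset from the exact values is expected.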
      
  2. Sep 2021
  3. Aug 2021
    1. We think R is a great place to start your data science journey because it is an environment designed from the ground up to support data science. R is not just a programming language, but it is also an interactive environment for doing data science. To support interaction, R is a much more flexible language than many of its peers. This flexibility comes with its downsides, but the big upside is how easy it is to evolve tailored grammars for specific parts of the data science process. These mini languages help you think about problems as a data scientist, while supporting fluent interaction between your brain and the computer.
    2. If you’re routinely working with larger data (10-100 GB, say), you should learn more about data.table. This book doesn’t teach data.table because it has a very concise interface, which makes it harder to learn since it offers fewer linguistic cues. But if you’re working with large data, the performance payoff is worth the extra effort required to learn it.
  4. Feb 2017
  5. Sep 2015