19 Matching Annotations
  1. Feb 2023
  2. May 2022
    1. The source sequence will be passed to the TransformerEncoder, which will produce a new representation of it. This new representation will then be passed to the TransformerDecoder, together with the target sequence so far (target words 0 to N). The TransformerDecoder will then seek to predict the next words in the target sequence (N+1 and beyond).
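
The "target sequence so far" idea in the quote above can be sketched in plain Python. This is only an illustration of the offset-by-one pairing between decoder inputs and the words to predict; the helper name `make_decoder_pairs` and the toy sentence are assumptions made for this note, not code from the quoted source.

```python
def make_decoder_pairs(target_tokens):
    """Split one target sequence into decoder inputs and next-word labels.

    The decoder sees words 0..N and is trained to predict words 1..N+1,
    so the two lists are the same sequence offset by one position.
    """
    decoder_inputs = target_tokens[:-1]  # everything except the last token
    labels = target_tokens[1:]           # everything except the first token
    return decoder_inputs, labels


# Hypothetical toy target sentence with start/end markers.
target = ["[start]", "el", "gato", "duerme", "[end]"]
decoder_inputs, labels = make_decoder_pairs(target)

# At each step the decoder has the prefix so far and must predict the next word.
for step, label in enumerate(labels):
    print(decoder_inputs[: step + 1], "->", label)
```

During training all prefixes are usually processed in one pass with a causal mask rather than in a loop; the loop here only makes the pairing visible.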
  3. Oct 2021
    1. Even with this very primitive single neuron, you can achieve 90% accuracy when recognizing a handwritten text image. To recognize all the digits from 0 to 9, you would need just ten neurons to recognize them with 92% accuracy.

      And here is a Google Colab notebook that demonstrates that

  4. Apr 2021
    1. In the median case, Colab is going to assign users a K80, and the GTX 1080 is around double the speed, which does not stack up particularly well for Colab. However, on occasion, when a P100 is assigned, the P100 is an absolute killer GPU (again, for FREE).

      Some of the GPUs from Google Colab are outstanding.

  5. Feb 2021
  6. Aug 2020
    1. Fastprogress is a clean, well-designed progress bar library brought to you by the fastai family.

      Might come in handy for tracking progress of long executions:

      ! pip install fastprogress
      from fastprogress import master_bar, progress_bar
    2. Remote in through VSCode using SSH and ngrok
      • You can connect to Colab remotely through VSCode with SSH and ngrok
      • Benefits: terminal access, not having to reenter github password, being able to edit .py files locally inside of VSCode
      • You might want to pay $10/month to have a static TCP (ngrok pro) to prevent disconnections
    3. Don’t forget to tell Git who you are, add this cell so you don’t have to answer every time you commit during a new session!

      Authenticate yourself with GitHub:

      !git config --global user.email "<YOUR EMAIL>"
      !git config --global user.name "<YOUR NAME>"
    4. This will allow you to grab both public and private repos without leaving your password exposed in the notebook.

      Connecting your GitHub:

      import os
      from getpass import getpass
      import urllib.parse
      user = 'rbracco'
      password = getpass('Password: ')
      repo_name = 'fastai2_audio'
      # your password is converted into url format
      password = urllib.parse.quote(password)
      cmd_string = 'git clone https://{0}:{1}@github.com/{0}/{2}.git'.format(user, password, repo_name)
      os.system(cmd_string)  # actually run the clone before wiping the variables
      cmd_string, password = "", "" # removing the password from the variable
      # Bad password fails silently so make sure the repo was copied
      assert os.path.exists(f"/content/{repo_name}"), "Incorrect Password or Repo Not Found, please try again"

      GIF workflow

    5. Gdown is a nice library for downloading large files from drive to colab.

      Use Gdown library to download large files, if:

      • the dataset is public
      • it's more than 10 GB
      • you're downloading it multiple times a day
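
A minimal sketch of what a gdown download might look like from a Colab cell, assuming the file is shared publicly and that `gdown` is installed (`pip install gdown`). The helper names `drive_download_url` and `download_from_drive` and the file ID in the usage comment are placeholders made up for this note.

```python
def drive_download_url(file_id: str) -> str:
    """Build the direct-download URL form that gdown accepts for a Drive file ID."""
    return f"https://drive.google.com/uc?id={file_id}"


def download_from_drive(file_id: str, output: str):
    """Download a publicly shared Drive file to `output` using gdown."""
    import gdown  # third-party package, imported lazily so the helper above works without it

    return gdown.download(drive_download_url(file_id), output, quiet=False)


# Usage in a Colab cell (hypothetical file ID and output path):
# download_from_drive("YOUR_FILE_ID", "/content/dataset.tar.gz")
```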
    6. due to weird Google Drive quota issues, you are better off copying the archive to colab and decompressing it there than you are decompressing the archive while it is hosted on your drive

      Decompress your archives on the Colab instance's local disk, not while they are still sitting on Google Drive
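
The copy-first, decompress-second order could look roughly like this, using only the standard library. It is a sketch under assumptions: the helper name `copy_then_extract`, the default `/content/data` working directory, and the Drive path in the usage comment are illustrative, and the archive is assumed to be a tar file.

```python
import shutil
import tarfile
from pathlib import Path


def copy_then_extract(archive_on_drive, work_dir="/content/data"):
    """Copy an archive to local disk first, then extract the local copy.

    All the small per-file writes then hit the Colab disk instead of the
    mounted Drive folder; only one large file is read from Drive.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # One big sequential read from Drive.
    local_archive = work_dir / Path(archive_on_drive).name
    shutil.copy(archive_on_drive, local_archive)

    # Extraction happens entirely on the local disk.
    with tarfile.open(local_archive) as tar:
        tar.extractall(work_dir)
    return work_dir


# Usage in Colab after mounting Drive (hypothetical archive path):
# copy_then_extract("/content/drive/My Drive/data/train-clean-100.tar.gz")
```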

    7. I learned and did to make it possible to do Automated Speech Recognition research on a Colab instance.

      It's possible to do Automated Speech Recognition with Google Colab.

      Notebook instance of this post

    8. It has been an open secret that you can avoid getting disconnected on Colab by opening the console and entering JavaScript to click the reconnect button for you. It gets very old pressing Ctrl-Shift-I, finding this snippet, and pasting it in every time you start a new session, but Colab gives you the ability to run JavaScript from a cell using the %%javascript magic. Add this cell before your training loop and run it when you plan to do a long training run to avoid getting disconnected mid-training.
      • You can add JavaScript to Colab with %%javascript
      • It's worth adding the following snippet before training to avoid getting disconnected:
      function ClickConnect(){
    9. As you know, Colab deletes any files you’ve downloaded or created when you end a session. The best option is to use GitHub to store your code (details below), and Google Drive to store datasets, logs, and anything else that would normally reside on your filesystem but wouldn’t be tracked by git.
      • That's why you might want to run the code:

        from google.colab import drive
        drive.mount('/content/drive')
      • After running it, you'll click a link and follow a 30-second process

      • Afterwards, all your files will be available at /content/drive
    10. Uploads from your computer to Google Drive can be incredibly slow, especially when dealing with multiple GBs of data. Download speeds are much faster, so take advantage with the command ! wget -c -P save_path url. This allows you to download the data only once, saving you time and saving bandwidth for the generous owners of publicly hosted datasets.

      It's more efficient to get your datasets with ! wget -c -P save_path url rather than uploading them:

      ! wget -c -P '/content/drive/My Drive/Colab Notebooks/data/' http://www.openslr.org/resources/12/train-clean-100.tar.gz
  7. Jul 2020
    1. This model is the most flexible and open-ended of the four; your goal as an instructor is not to design a full-fledged semester of material, activities, and assessments. Rather, your goal is to work with your class to design and become a learning community, working collaboratively and individually towards your determined learning goals. For this to work you should have:
      • a set of possible/preferred learning objectives for your class
      • a library of course materials, preferably with as much as possible in digital format
      • a suggested list of digital tools and technologies that you’re comfortable with
      • a list of possible assignment/project/assessment ideas that are related to your learning objectives
      • a willingness to experiment and invite your students into the teaching & learning process

      At the onset of class you will need to facilitate a conversation among you and your students about how the class will unfold. This can be done in small groups f2f, via an online communication tool, or in a hybrid mix of both. As a community you should plan on addressing the following:
      • what are our objectives as a learning community?
      • what kind of work could we engage in to meet these objectives?
      • what physical/virtual spaces would we like to work in?
      • how/when do we want to meet in these spaces?
      • how do we want to measure (assess) if an objective has been met?
      • what rules and policies should govern our work?
      • how will we work virtually and respect everyone’s boundaries and personal situations?
      • how will we work f2f and respect public health recommendations and personal situations?

      You will probably need to spend at least the first 1-2 weeks answering these questions together and then designing a plan for your course. Make sure you and your students talk through various complications:
      • what if the university’s policies about meeting f2f change?
      • what if classes are forced to move entirely virtual/remote?
      • what if someone (students or professor!) gets sick?

      This is the one for me!!!!

    2. c

      Apologies for highlighting whole swaths of paragraphs but it can't be helped sometimes lol.

    3. Finally, these are NOT meant to be comprehensive. Instead, imagine these models along a continuum of opportunity. Your challenge is to determine where your courses could fit between and among the proposals.  

      I'm wondering how much or how little faculty will need to change their curriculum/delivery depending on the various inevitable changes that we can't exactly predict will happen this school year. For those faculty members purposefully switching online, what changes have they made already, and what changes will become necessary in the near future?