- Aug 2024
-
www.youtube.com
-
We are using set theory, so a certain piece of reference text is either part of my collection or it's not. If it's part of my collection, somewhere in my fingerprint there is a corresponding dot for it. So there is a very clear, direct link from the raw data to the actual representation, and from the position that dot has versus all the other dots. The topology of that space, the geometry if you want, of the patterns you get, contains the knowledge of the world whose language I'm using. And that is super easy to compute for a computer; I don't even need a GPU.
For comparison: Cortical.io semantic folding vs. standard AI; no GPU required.
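The quote above describes each reference text as a dot in a sparse binary fingerprint, compared by plain set operations. A toy sketch of that idea (my own illustration in Python, not Cortical.io's actual semantic folding code):

# Toy illustration of sparse binary "fingerprints": each document is just the
# set of active positions, and similarity is plain set overlap on a CPU.
# Illustrative only; not the real semantic folding algorithm.

def overlap(fp_a: set[int], fp_b: set[int]) -> float:
    """Fraction of shared active positions (Jaccard similarity)."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

doc_a = {3, 17, 42, 99, 512}     # active dots for document A
doc_b = {17, 42, 256, 512, 900}  # active dots for document B
print(overlap(doc_a, doc_b))     # cheap set arithmetic, no GPU needed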
-
- Apr 2023
-
webgpufundamentals.org
-
-
developer.mozilla.org
-
-
yosefk.com
-
Very nice article from 2011 explaining the hardware-level performance of GPUs.
-
- Jan 2023
-
-
Other hardware options do exist, including Google Tensor Processing Units (TPUs); AMD Instinct GPUs; AWS Inferentia and Trainium chips; and AI accelerators from startups like Cerebras, Sambanova, and Graphcore. Intel, late to the game, is also entering the market with their high-end Habana chips and Ponte Vecchio GPUs. But so far, few of these new chips have taken significant market share. The two exceptions to watch are Google, whose TPUs have gained traction in the Stable Diffusion community and in some large GCP deals, and TSMC, who is believed to manufacture all of the chips listed here, including Nvidia GPUs (Intel uses a mix of its own fabs and TSMC to make its chips).
Look at the market share for TensorFlow and PyTorch, which both offer first-class NVIDIA support; that likely spells out the story. If you are getting into AI, you go learn one of those frameworks, and they tell you to install CUDA.
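For instance, the usual first sanity check after following either framework's install instructions already assumes a CUDA setup (a generic sketch; exact output depends on your versions and drivers):

# Typical "did my install work?" checks; both effectively ask whether CUDA is set up.
import torch
print(torch.cuda.is_available())               # PyTorch: True only with a working CUDA install

import tensorflow as tf
print(tf.config.list_physical_devices("GPU"))  # TensorFlow: empty list without CUDA/cuDNN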
-
- Nov 2022
-
www.w3.org (WebGPU)
-
- Nov 2021
-
lilianweng.github.io
-
two major memory consumption of large model training: The majority is occupied by model states, including optimizer states (e.g. Adam momentums and variances), gradients and parameters. Mixed-precision training demands a lot of memory since the optimizer needs to keep a copy of FP32 parameters and other optimizer states, besides the FP16 version. The remaining is consumed by activations, temporary buffers and unusable fragmented memory (named residual states in the paper).
What are the main sources of GPU memory consumption when training deep networks?
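A rough back-of-the-envelope sketch (my own, not from the post) of the per-parameter model-state memory for mixed-precision Adam training, following the usual 2 + 2 + 12 bytes-per-parameter accounting; the 7.5B-parameter figure is illustrative:

# Rough sketch of mixed-precision Adam memory accounting for model states.

def model_state_bytes_per_param() -> int:
    fp16_params   = 2  # FP16 weights used in forward/backward
    fp16_grads    = 2  # FP16 gradients
    fp32_params   = 4  # FP32 master copy kept by the optimizer
    fp32_momentum = 4  # Adam first moment
    fp32_variance = 4  # Adam second moment
    return fp16_params + fp16_grads + fp32_params + fp32_momentum + fp32_variance  # 16 bytes

num_params = 7_500_000_000  # e.g. a 7.5B-parameter model
print(num_params * model_state_bytes_per_param() / 2**30, "GiB of model states")
# Activations, temporary buffers and fragmentation (the "residual states") come on top.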
-
It partitions optimizer state, gradients and parameters across multiple data parallel processes via a dynamic communication schedule to minimize the communication volume.
What is the principle behind ZeRO-DP?
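A toy sketch of the partitioning idea (illustrative only; the real ZeRO implementation adds a dynamic communication schedule on top): each data-parallel rank owns one equal slice of the optimizer state, gradients and parameters instead of replicating all of them.

# Toy illustration of ZeRO-style partitioning across data-parallel ranks.

def shard_bounds(num_params: int, rank: int, world_size: int) -> tuple[int, int]:
    """Half-open [start, end) range of parameters owned by this rank."""
    per_rank = (num_params + world_size - 1) // world_size
    start = rank * per_rank
    return start, min(start + per_rank, num_params)

num_params, world_size = 10, 4
for rank in range(world_size):
    print(rank, shard_bounds(num_params, rank, world_size))
# Gradients for a slice are reduce-scattered to its owner, which updates its shard
# of parameters and optimizer state; updated parameters are then all-gathered.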
-
- Apr 2021
-
statisticsplaybook.github.io
-
3.1 Checking GPU availability
There are cases where this check returns FALSE. GPU tensors then do not work; as far as I know, the problem comes from CUDA version compatibility with the GPU. Since many places use versions 10.1 and 10.2, I installed those versions as well. But even after reinstalling CUDA and loading the library, you can still see FALSE. Please try the code below. The first line must of course point to the location where your CUDA version is installed. The second line may throw an error, but if you reinstall the package and load it again, it works normally.
Sys.setenv("CUDA_HOME" = "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.2")
source("https://raw.githubusercontent.com/mlverse/torch/master/R/install.R")
install.packages("torch")
-
- Apr 2020
-
gist.github.com
-
NVIDIA's CUDA libraries
CUDA has moved to homebrew-drivers [1] and its name has also changed to nvidia-cuda. To install:
brew tap homebrew/cask-drivers
brew cask install nvidia-cuda
https://i.imgur.com/rmnoe6d.png
[1] https://github.com/Homebrew/homebrew-cask/issues/38325#issuecomment-327605803
-