2,505 Matching Annotations
  1. Nov 2024
    1. We’re leaving Kubernetes

      Why Gitpod is Leaving Kubernetes

      Gitpod has decided to transition away from Kubernetes for managing cloud development environments, opting instead for a custom-built solution better suited to their needs. While Kubernetes is powerful for orchestrating stateless application workloads, Gitpod identified several challenges that made it less ideal for their dynamic, stateful workloads.

      Key Challenges of Kubernetes

      • Resource Overhead: Kubernetes introduces significant complexity and resource consumption, which is inefficient for scaling ephemeral development environments.

      • Latency in Scaling: The time required to scale pods and handle stateful workloads can slow down developer workflows that demand near-instant provisioning.

      • Stateful Workloads: Kubernetes is designed for stateless applications, and adapting it for stateful environments adds operational complexity.

      • Cost Inefficiency: Running dynamic workloads on Kubernetes incurs higher operational costs due to the constant need for scaling and resource orchestration.

      • Security Concerns: Managing multi-tenant security on Kubernetes is challenging, requiring considerable effort to ensure workload isolation and permission control.

      • Operational Complexity: Maintaining Kubernetes clusters at scale involves a significant operational burden, including updates, monitoring, and configuration management.

      Gitpod is now focusing on Gitpod Flex, a new solution tailored to better meet the demands of developers, offering improved scalability, efficiency, and simplicity.

    1. Data scientists, MLOps engineers, or AI developers, can mount large language model weights or machine learning model weights in a pod alongside a model-server, so that they can efficiently serve them without including them in the model-server container image. They can package these in an OCI object to take advantage of OCI distribution and ensure efficient model deployment. This allows them to separate the model specifications/content from the executables that process them.

      The introduction of the Image Volume Source feature in Kubernetes 1.31 allows MLOps practitioners to mount OCI-compatible artifacts, such as large language model weights or machine learning models, directly into pods without embedding them in container images. This streamlines model deployment, enhances efficiency, and leverages OCI distribution mechanisms for effective model management.
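
As a rough illustration of the idea above, the sketch below builds a Pod manifest as a Python dictionary and prints it as JSON. The registry references, the Pod name, and the /models mount path are hypothetical placeholders. The volumes[].image field with reference and pullPolicy follows the Kubernetes 1.31 alpha API, which as far as I know also needs the ImageVolume feature gate and a container runtime that supports it.

```python
import json


def model_server_pod(server_image: str, weights_ref: str) -> dict:
    """Build a Pod manifest that mounts an OCI artifact as a read-only volume."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "model-server"},  # hypothetical name
        "spec": {
            "containers": [
                {
                    "name": "server",
                    "image": server_image,
                    # The model weights appear under /models inside the container.
                    "volumeMounts": [{"name": "weights", "mountPath": "/models"}],
                }
            ],
            "volumes": [
                {
                    "name": "weights",
                    # Image volume source: the weights live in their own OCI object.
                    "image": {"reference": weights_ref, "pullPolicy": "IfNotPresent"},
                }
            ],
        },
    }


# Both references are placeholders, not real images.
manifest = model_server_pod(
    "registry.example.com/model-server:1.0",
    "registry.example.com/llm-weights:v1",
)
print(json.dumps(manifest, indent=2))
```

Keeping the weights reference as a separate argument mirrors the point of the feature: the model content can be versioned and swapped without rebuilding the model-server image.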

    1. Deploying Machine Learning Models with Flask and AWS Lambda: A Complete Guide

      In essence, this article is about:

      1) Training a sample model and uploading it to an S3 bucket:

      ```python
      from sklearn.datasets import load_iris
      from sklearn.model_selection import train_test_split
      from sklearn.linear_model import LogisticRegression
      import joblib

      # Load the Iris dataset
      iris = load_iris()
      X, y = iris.data, iris.target

      # Split the data into training and testing sets
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

      # Train the logistic regression model
      model = LogisticRegression(max_iter=200)
      model.fit(X_train, y_train)

      # Save the trained model to a file
      joblib.dump(model, 'model.pkl')
      ```

      2) Creating a sample Zappa config. AWS Lambda doesn’t natively support Flask, so we need Zappa, a tool that helps deploy WSGI applications (like Flask) to AWS Lambda:

      ```json
      {
          "dev": {
              "app_function": "app.app",
              "exclude": [
                  "boto3",
                  "dateutil",
                  "botocore",
                  "s3transfer",
                  "concurrent"
              ],
              "profile_name": null,
              "project_name": "flask-test-app",
              "runtime": "python3.10",
              "s3_bucket": "zappa-31096o41b"
          },
          "production": {
              "app_function": "app.app",
              "exclude": [
                  "boto3",
                  "dateutil",
                  "botocore",
                  "s3transfer",
                  "concurrent"
              ],
              "profile_name": null,
              "project_name": "flask-test-app",
              "runtime": "python3.10",
              "s3_bucket": "zappa-31096o41b"
          }
      }
      ```

      3) Writing a sample Flask app:

      ```python
      import boto3
      import joblib
      import numpy as np
      from flask import Flask, jsonify, request

      # Initialize the Flask app
      app = Flask(__name__)

      # S3 client to download the model
      s3 = boto3.client('s3')

      # Download the model from S3 when the app starts
      # (/tmp is the writable directory on AWS Lambda)
      s3.download_file('your-s3-bucket-name', 'model.pkl', '/tmp/model.pkl')
      model = joblib.load('/tmp/model.pkl')


      @app.route('/predict', methods=['POST'])
      def predict():
          # Get the data from the POST request
          data = request.get_json(force=True)

          # Convert the data into a numpy array
          input_data = np.array(data['input']).reshape(1, -1)

          # Make a prediction using the model
          prediction = model.predict(input_data)

          # Return the prediction as a JSON response
          return jsonify({'prediction': int(prediction[0])})


      if __name__ == '__main__':
          app.run(debug=True)
      ```

      4) Deploying this app to production (to AWS):

      zappa deploy production

      and updating it later:

      zappa update production

      5) We should get a URL like this:

      https://xyz123.execute-api.us-east-1.amazonaws.com/production

      which we can query:

      curl -X POST -H "Content-Type: application/json" -d '{"input": [5.1, 3.5, 1.4, 0.2]}' https://xyz123.execute-api.us-east-1.amazonaws.com/production/predict
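
The same query can be sketched in Python with only the standard library. This is a minimal illustration, not part of the article: build_predict_request is a hypothetical helper, and the URL is the placeholder shown above, so the actual network call is left commented out.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: list[float]) -> urllib.request.Request:
    """Prepare a POST request equivalent to the curl command above."""
    body = json.dumps({"input": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request(
    "https://xyz123.execute-api.us-east-1.amazonaws.com/production",
    [5.1, 3.5, 1.4, 0.2],
)

# Sending the request needs a deployed endpoint:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```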

    1. I’m writing this on October 15th, 2024. Last week I would’ve said you probably shouldn’t be using uv’s Python in production, because you wouldn’t be getting security updates to OpenSSL. This week, I would tentatively say that it’s fine. This makes me a little uncomfortable, because there may well be other issues I haven’t thought of, and uv is still very new.

      You may use uv in production, but there may still be some undiscovered quirks.

    2. The uv-provided Python executable is slower than the one shipped by Ubuntu 24.04 LTS, but it’s faster than the “official” Docker image.
    3. The ability to install Python with uv adds interesting possibilities for production packaging. For example, you can use an Ubuntu 24.04 base Docker image, download uv, and rely on uv to trivially install any Python version. Which is to say, you won’t be limited to the versions Ubuntu packages for you.
    4. Unlike most Python packaging tools, uv doesn’t require Python to be installed to use it.

      About the uv Python packaging tool

    1. Optimizing Kubernetes Costs with Multi-Tenancy and Virtual Clusters

      The blog post by Cliff Malmborg from Loft Labs discusses optimizing Kubernetes costs using multi-tenancy and virtual clusters. With Kubernetes expenses rising rapidly at scale, traditional cost-saving methods like autoscaling, resource quotas, and monitoring tools help but are not enough for complex environments where underutilized clusters are common. Multi-tenancy enables resource sharing, reducing the number of clusters and, in turn, management and operational costs.

      A virtual cluster is a fully functional Kubernetes cluster running within a larger host cluster, providing better isolation and flexibility than namespaces. Unlike namespaces, each virtual cluster has its own Kubernetes control plane, so resources like statefulsets and webhooks are isolated within it, while only core resources (like pods and services) are shared with the host cluster. This setup addresses the "noisy neighbor" problem, where workloads in a shared environment interfere with each other due to resource contention.

      Virtual clusters offer the isolation benefits of individual physical clusters but are cheaper and easier to manage than deploying separate physical clusters for each tenant or application. They also support "sleep mode," automatically scaling down unused resources to save costs, and allow shared use of central tools (like ingress controllers) installed in the host cluster. By transitioning to virtual clusters, companies can balance security, isolation, and cost-effectiveness, reducing the need for multiple physical clusters and making Kubernetes infrastructure scalable for modern, resource-demanding applications.

    1. The price of gold has exceeded 2,700 USD per ounce. Since the beginning of this year, gold priced in USD has risen by nearly 32%, and in PLN by just under 33%. The metal is heading for its best year of trading in 45 years.

      Gold has risen in price by 33% since the beginning of the year and is now worth PLN 10,780 per ounce. It is on track for its best year in 45 years.

    1. According to the assumptions of the planned changes, the use of the 5% income tax rate by programmers may soon be restricted. A programmer who wants to use IP BOX will have to employ at least three people on employment contracts for 300 days a year. Hiring 3 people on contracts of mandate (umowa zlecenie) will be possible, but only if the monthly remuneration is 3 times the minimum wage (approx. PLN 24,500 per month). The changes may take effect as early as 2025, which means that many programmers running sole proprietorships will no longer be able to use the reduced 5% income tax rate.

      IP BOX (5%) may come to an end in 2025 for many sole-proprietor programmers.

    1. If an IT specialist provides software-related services, they are not entitled to the 8.5% lump-sum tax. They must pay 12%.

      This article (which can be viewed using a paywall remover) shows that even if someone does not deal directly with code, the court may think otherwise:

      The tax office and the NSA do not allow the lower lump-sum rate. In the case that reached the NSA (Supreme Administrative Court), the dispute with the tax office began with a request for a tax ruling from an IT specialist running a sole proprietorship. He wrote that he designs and develops information technologies for networks and computer systems or their individual parts/components. He also provides technical support services. He described all his activities in detail. He also stressed that creating software is not among his duties. He therefore believes he is entitled to the lower lump-sum rate. The tax office disagreed, however. It held that the IT specialist should pay the 12% lump sum, because “the fact that the taxpayer's duties do not include creating computer software is not a sufficient basis for finding that the services provided are unrelated to software”.

      The IT specialist appealed the ruling but lost at both first and second instance. The Provincial Administrative Court (WSA) in Gliwice noted that, according to the request for a ruling, the services provided are related to the development of the SAP system. That system is software that supports running a business. The described services are therefore undoubtedly software-related, the Gliwice WSA held. The NSA also favored a broad understanding of the term “software-related services”. It stressed that this covers not only programming but other activities as well, including those performed by the entrepreneur.

      The court listed all of the IT specialist's duties and simply stated that they are software-related services. From the NSA I would have expected a deeper analysis and a precise indication of which of the listed activities meet this criterion. Let us hope that subsequent rulings will be more thorough, and that after deeper consideration it may turn out that the taxpayers are right after all, concludes Piotr Sekulski.

  2. Oct 2024
    1. Marketing and sales also recorded a small increase, of 1%, as did the IT sector, of 5%.

      In September, employers in Poland published 12% more job offers y/y (288 thousand).

      The increase has been going on for 4 months, and now it has reached the highest level since March 2022. The largest increase in the number of job offers is in medical professions (24%) and manual workers (13%). Decrease among financiers (-15%), HR specialists (-10%) and lawyers (-7%). In IT, the number of offers increased by 5%, and in marketing and sales by 1%.

    1. Remember to bring DVI/VGA-to-DisplayPort adapter if you own a Mac

      Good advice for the presenters using macOS

  3. Sep 2024
    1. Design, Setting, and Participants  In a population-based registry study, data on all Finnish citizens born between January 1, 1985, and December 31, 1997, whose demographic, health, and school information were linked from nationwide registers were included. Cohort members were followed up from August 1 in the year they completed ninth grade (approximately aged 16 years) until a diagnosis of mental disorder, emigration, death, or December 31, 2019, whichever occurred first. Data analysis was performed from May 15, 2023, to February 8, 2024.

      Mental disorders are indirectly contagious – i.e. negative emotional and behavioral patterns that cause illness are transferred even to friends of people with disorders, a study in Finland involving 700,000 people has shown.

      The data showed that having friends diagnosed with mental disorders in the 9th grade of secondary school increased the risk of developing mental disorders later in life, such as mood, anxiety and eating disorders, by up to 18%.

    1. Regular consumption of moderate amounts of coffee and tea may protect against the development of many cardiometabolic diseases, including type 2 diabetes, coronary heart disease and stroke, at least according to new research conducted by Swedish and Chinese scientists.

      Drinking 3 cups of coffee or 200-300 mg of caffeine a day can halve the risk of diseases such as type 2 diabetes, coronary artery disease and stroke, researchers from Suzhou University in China, in collaboration with Swedish scientists, have shown.

      Moderate caffeine consumption may protect cardiovascular health, regardless of age, gender, smoking or diet. The study is based on data from over 300,000 people from the UK Biobank, collected over 11 years.

  4. Aug 2024
    1. Slashing Data Transfer Costs in AWS by 99%

      The essence of cutting AWS data transfer costs by 99% is to use Amazon S3 as an intermediary for data transfers between EC2 instances in different Availability Zones (AZs). Instead of direct transfers, which incur significant costs, you upload the data to S3 (free upload), and then download it within the same region (free download). By keeping the data in S3 only temporarily, you minimize storage costs, drastically reducing overall transfer expenses.
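
A back-of-the-envelope sketch of why this works. All prices here are assumptions for illustration (roughly 0.01 USD per GB in each direction for cross-AZ traffic and 0.023 USD per GB-month for S3 Standard storage); they are not taken from the article, vary by region, and the sketch ignores S3 request fees.

```python
def direct_cross_az_cost(gb: float, per_gb_each_way: float = 0.01) -> float:
    """Cross-AZ traffic is billed on both the sending and the receiving side."""
    return gb * per_gb_each_way * 2


def via_s3_cost(
    gb: float,
    hours_stored: float,
    storage_per_gb_month: float = 0.023,  # assumed S3 Standard price
    hours_per_month: float = 730.0,       # assumed average month length
) -> float:
    """Only the temporary storage is counted; request fees are ignored."""
    return gb * storage_per_gb_month * hours_stored / hours_per_month


gb = 1000.0  # roughly 1 TB moved between two AZs
direct = direct_cross_az_cost(gb)
s3 = via_s3_cost(gb, hours_stored=1.0)
savings = 1 - s3 / direct

print(f"direct: ${direct:.2f}, via S3: ${s3:.4f}, savings: {savings:.1%}")
```

The saving depends almost entirely on how quickly the object is deleted, which is why the data should stay in S3 only temporarily.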

    1. The seasoned engineer learns that sometimes the best code is the code you never wrote. They become adept at delegating tasks, capitalizing on the strengths of their colleagues, and asking the dreaded question, "But why?" — a question that often leads to the heart of what needs to be solved, avoiding unnecessary work and focusing on what truly adds value.
  5. Jun 2024
    1. Poles keep getting richer. The number of well-paid and very well-paid people is growing

      Info pill:

      The number of Poles earning over PLN 10,000 per month rose in 2022 by 51% y/y, to 1.5 million people. At the end of 2022 there were 90,000 dollar millionaires in Poland, 10% fewer than a year earlier.

      The pay of those earning over PLN 10,000 rose by 10% to approx. PLN 375 billion. The main factor was high inflation, but also the influx of Ukrainians with good earnings in corporations.

      The number of people earning PLN 20,000-50,000 per month rose by 38% to 440,000; those earning between PLN 50,000 and 83,000 rose by 1% to 84,000; and the number of people earning over PLN 83,000 (i.e. PLN 1 million a year) fell by 5% to 35,000. The number of people with wealth above USD 50 million rose by almost 2%.

    1. Neither of the methods shown above are ideal in environments where you require several clusters or need them to be provisioned in a consistent way by multiple people.

      In this case, IaC is favored over using EKS directly or manually deploying on EC2

    2. Running a cluster directly on EC2 also gives you the choice of using any available Kubernetes distribution, such as Minikube, K3s, or standard Kubernetes as deployed by Kubeadm.
    3. EKS is popular because it’s so simple to configure and maintain. You don’t need to understand the details of how Kubernetes works or how Nodes are joined to your cluster and secured. The EKS service automates cluster management procedures, leaving you free to focus on your workloads. This simplicity can come at a cost, though: you could find EKS becomes inflexible as you grow, and it might be challenging to migrate from if you switch to a different cloud provider.

      Why use EKS

    4. The EKS managed Kubernetes engine isn’t included in the free tier. You’ll always be billed $0.10 per hour for each cluster you create, in addition to the EC2 or Fargate costs associated with your Nodes. The basic EKS charge only covers the cost of running your managed control plane. Even if you don’t use EKS, you’ll still need to pay to run Kubernetes on AWS. The free tier gives you access to EC2 for 750 hours per month on a 12-month trial, but this is restricted to the t2.micro and t3.micro instance types. These only offer 1 GiB of RAM so they’re too small to run most Kubernetes distributions.

      Cost of EKS
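
The figures quoted above can be turned into a quick monthly estimate. This is a small sketch using the 0.10 USD hourly rate and the 750 free EC2 hours from the quote; the 730-hour month and both function names are my own assumptions.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def eks_control_plane_cost(clusters: int, hourly_rate: float = 0.10) -> float:
    """Monthly charge for the managed control planes only (Nodes are billed separately)."""
    return clusters * hourly_rate * HOURS_PER_MONTH


def free_tier_covers_always_on(instances: int, free_hours: int = 750) -> bool:
    """Check whether always-on free-tier instances fit into the monthly free hours."""
    return instances * HOURS_PER_MONTH <= free_hours


print(f"1 cluster:  ${eks_control_plane_cost(1):.2f} per month")
print(f"3 clusters: ${eks_control_plane_cost(3):.2f} per month")
print("2 always-on micro instances free?", free_tier_covers_always_on(2))
```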

    5. Some of the other benefits of Kubernetes on AWS include

      Benefits of using Kubernetes on AWS:
      • scalability
      • cost efficiency
      • high availability

    1. Note that the Python documentation refers to these as special methods and notes the synonym "magic method" but very rarely uses the term "dunder method". However, "dunder method" is a fairly common Python colloquialism, as noted in my unofficial Python glossary.

      special methods = magic methods = dunder methods
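
A tiny illustration of what these methods do; the Playlist class is a made-up example, not taken from the linked article. Python calls the double-underscore methods implicitly when built-ins such as len(), indexing, or repr() are used on the object.

```python
class Playlist:
    """Toy container that opts into built-in behaviour via special methods."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self.songs)

    def __getitem__(self, index):
        # Called by playlist[index]; also makes the object iterable
        return self.songs[index]

    def __repr__(self):
        # Called by repr(playlist) and by the interactive prompt
        return f"Playlist({self.songs!r})"


playlist = Playlist(["a", "b", "c"])
print(len(playlist), playlist[0], playlist)
```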

    1. python -m webbrowser https://pym.dev/p

      Opening URL using Python's webbrowser module

    1. Sample .devcontainer/devcontainer.json:

      ```json
      {
        "name": "Global",
        "build": {
          "context": "..",
          "dockerfile": "Dockerfile"
        },
        "containerEnv": {
          "PYTHONPATH": "."
        },
        "customizations": {
          "vscode": {
            "settings": {
              "extensions.verifySignature": false
            },
            "extensions": [
              "GitHub.copilot",
              "ms-python.vscode-pylance",
              "ms-python.python",
              "eamodio.gitlens"
            ]
          }
        },
        "initializeCommand": "/bin/bash -c '[[ -d ${HOME}/.aws ]] || { echo \"Error: ${HOME}/.aws directory not found.\"; exit 1; }; [[ -f ${HOME}/.netrc ]] || { echo \"Error: ${HOME}/.netrc file not found.\"; exit 1; }; [[ -d ${HOME}/.ssh ]] || { echo \"Error: ${HOME}/.ssh directory not found.\"; exit 1; }; echo \"\n> All required mounts found on the host machine.\"'",
        "onCreateCommand": {
          "hadolint": "apt-get update && apt-get install wget -y && wget -O /bin/hadolint https://github.com/hadolint/hadolint/releases/download/v2.12.0/hadolint-Linux-x86_64 && chmod u+x /usr/bin/hadolint",
          "precommit": "pip install pre-commit"
        },
        "updateContentCommand": "/bin/bash -c 'if grep -A 2 \"machine gitlab.com\" ~/.netrc | grep -q \"password\" && GITLAB_TOKEN=$(grep -A 2 \"machine gitlab.com\" ~/.netrc | grep -oP \"(?<=password ).*\" | tr -d \"\\n\") && [ -n \"$GITLAB_TOKEN\" ]; then echo \"\n> Token found in ~/.netrc\"; else read -sp \"\n> Enter your GitLab token: \" GITLAB_TOKEN && echo; fi; echo \"export GITLAB_TOKEN=$GITLAB_TOKEN\" >> ~/.bashrc && . ~/.bashrc && poetry config http-basic.abc __token__ $GITLAB_TOKEN'",
        "postCreateCommand": ". ~/.bashrc && curl -s --location 'https://gitlab.com/api/v4/projects/12345/repository/files/.pre-commit-config.yaml/raw?ref=main' --header \"PRIVATE-TOKEN: $GITLAB_TOKEN\" -o .pre-commit-config.yaml",
        "postAttachCommand": "/bin/bash -c '. ~/.bashrc && read -p \"\n> Do you want to update the content of devcontainer.json? (y/n): \" response; if [[ \"$response\" == \"y\" ]]; then curl -s --location \"https://gitlab.com/api/v4/projects/12345/repository/files/devcontainer.json/raw?ref=main\" --header \"PRIVATE-TOKEN: $GITLAB_TOKEN\" -o .devcontainer/devcontainer.json; else echo \"\n> Skipping update of devcontainer.json\"; fi'",
        "mounts": [
          "source=${localEnv:HOME}/.aws/,target=/root/.aws/,type=bind,readonly",
          "source=${localEnv:HOME}/.netrc,target=/root/.netrc,type=bind,readonly",
          "source=${localEnv:HOME}/.ssh/,target=/root/.ssh/,type=bind,readonly"
        ]
      }
      ```

    2. Some more of my recent learning with devcontainer.json (its Dev Container metadata):

      • Interactive commands (those waiting for user input, like read) do not display the input prompt in at least the onCreateCommand and postCreateCommand sections, so it is better to keep them in updateContentCommand or postAttachCommand.
      • If there are 2 read commands in a single section, like updateContentCommand, only the 1st one is displayed to the user, and the 2nd one is ignored.
      • When I put a read command within a dictionary (with at least 2 key/value pairs) of postAttachCommand, the interactive command wasn't displayed.
      • We need to use /bin/bash -c to be able to use read -s (the -s flag), which hides the typed password so that it does not stay in the VS Code console. Also, I had trouble with interactive commands and if statements without it.
      • Using "GITLAB_TOKEN": "${localEnv:GITLAB_TOKEN}" does not easily work, as it looks for a GITLAB_TOKEN env variable set locally on our host computers, which I believe no one sets.
      • The dictionary seems to be executing its scripts in parallel; therefore, it is not easily possible to break down long lines which have to execute in a chronological sequence.
      • JSON does not allow for human-readable line breaks; therefore, indeed, it seems impossible to improve the long one-liners.
      • The files/folders mentioned within mounts need to exist locally (otherwise, Docker container build fails). They are mounted before any other section. Technically, we can protect ourselves with the following command to find an extra message in VS Code container logs:

      "initializeCommand": "/bin/bash -c '[[ -d ${HOME}/.aws ]] || { echo \"Error: ${HOME}/.aws directory not found.\"; exit 1; }; [[ -f ${HOME}/.netrc ]] || { echo \"Error: ${HOME}/.netrc file not found.\"; exit 1; }; [[ -d ${HOME}/.ssh ]] || { echo \"Error: ${HOME}/.ssh directory not found.\"; exit 1; }'",

      Another option is to get rid of the error completely, but this creates files on the host machine; therefore, it is not an ideal solution:

      "initializeCommand": "mkdir -p ~/.ssh ~/.aws && touch ~/.netrc",
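
For readers who find the Bash one-liner hard to follow, the same pre-flight check can be sketched in Python. This is only an illustration of the logic, under the assumption that the three paths from the mounts section are the ones that matter; missing_mounts is a hypothetical helper, not something Dev Containers provides.

```python
from pathlib import Path

# Paths (relative to the home directory) that the mounts section expects on the host.
REQUIRED = {".aws": "dir", ".netrc": "file", ".ssh": "dir"}


def missing_mounts(home: Path) -> list[str]:
    """Return the required entries that do not exist under the given home directory."""
    missing = []
    for name, kind in REQUIRED.items():
        path = home / name
        exists = path.is_dir() if kind == "dir" else path.is_file()
        if not exists:
            missing.append(name)
    return missing


problems = missing_mounts(Path.home())
if problems:
    print("Error: not found on the host:", ", ".join(problems))
else:
    print("All required mounts found on the host machine.")
```

Unlike the mkdir/touch variant, this only reports what is missing and never creates anything on the host.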

    3. ["bash", "-i", "-c", "read -p 'Type a message: ' -t 10 && echo Attach $REPLY"],

      I would also simply put the following:

      /bin/bash -c 'read -p "Type a message: " -t 10 && echo Attach $REPLY'

    4. Consequently, it’s one of the only commands that consistently allows interactions with users.

      I also found that updateContentCommand allows user interaction (it displays interactive commands in the VS Code console).

    5. There are six available lifecycle script hooks

      Explanation of 6 available devcontainer.json (Dev Container in VS Code) hooks.

    1. In one study led by researchers at The University of Oxford, participants with insomnia were divided into two groups and given fake or "sham" feedback on their sleep. One group was told they had a "positive" night's sleep, the other a "negative" night's sleep, and were then asked to rate their mood and sleepiness. Those who were given a fake "negative" score rated themselves as much sleepier, and their mood significantly worse, than those who were given a fake "positive" score, and vice versa.

      Why sleep tracking may not make any sense

    1. BIK Alerts are more than protection against fraud

      List of reasons (below) why I may keep paying for BIK alerts. The price is PLN 36 per year if extended before 07/2024, otherwise PLN 42; I used to pay PLN 24.

      I may generally consider it while taking a housing loan.

  6. May 2024
    1. At Google, an AI team member said the burnout is the result of competitive pressure, shorter timelines and a lack of resources, particularly budget and headcount. Although many top tech companies have said they are redirecting resources to AI, the required headcount, especially on a rushed timeline, doesn’t always materialize. That is certainly the case at Google, the AI staffer said.
    2. A common feeling they described is burnout from immense pressure, long hours and mandates that are constantly changing. Many said their employers are looking past surveillance concerns, AI’s effect on the climate and other potential harms, all in the name of speed. Some said they or their colleagues were looking for other jobs or switching out of AI departments, due to an untenable pace.
    3. Artificial intelligence engineers at top tech companies told CNBC that the pressure to roll out AI tools at breakneck speed has come to define their jobs.
  7. Apr 2024
    1. Lesson 3: When executing a lot of requests to S3, make sure to explicitly specify the AWS region.
    2. Lesson 2: Adding a random suffix to your bucket names can enhance security.
    3. Lesson 1: Anyone who knows the name of any of your S3 buckets can ramp up your AWS bill as they like.

      The author was charged over $1,300 after two days of using an S3 bucket, because an open-source tool used a default bucket name in its configuration that happened to match the name of his bucket.

      Luckily, after everything AWS made an exception and he did not have to pay the bill.
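
Lesson 2 is easy to sketch: append a random suffix so the bucket name cannot be guessed or collide with a default in someone else's tooling. The helper below is a hypothetical illustration using only the standard library; it just generates the name and does not call AWS. For Lesson 3, the region would be passed separately when creating the client, for example through boto3's region_name argument.

```python
import secrets


def bucket_name_with_suffix(prefix: str, suffix_bytes: int = 8) -> str:
    """Return prefix plus a random hex suffix (2 hex characters per byte)."""
    suffix = secrets.token_hex(suffix_bytes)
    name = f"{prefix}-{suffix}"
    # S3 bucket names must be between 3 and 63 characters long.
    if not 3 <= len(name) <= 63:
        raise ValueError("S3 bucket names must be 3-63 characters long")
    return name


name = bucket_name_with_suffix("my-app-backups")  # hypothetical prefix
print(name)
```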

    1. Google said Axion provides “up to 30% better performance than the fastest general-purpose Arm-based instances available in the cloud today” and “up to 50% better performance and up to 60% better energy-efficiency” than other general purpose Arm chips.
    2. Google’s new AI chip is a rival to Nvidia, and its Arm-based CPU will compete with Microsoft and Amazon
    1. Socially, we’re told, “Go work out. Go look good.” That’s a multi-player competitive game. Other people can see if I’m doing a good job or not. We’re told, “Go make money. Go buy a big house.” Again, external multiplayer competitive game. Training yourself to be happy is completely internal. There is no external progress, no external validation. You’re competing against yourself—it is a single-player game.
    1. Replacing the lock icon with a neutral indicator prevents the misunderstanding that the lock icon is associated with the trustworthiness of a page, and emphasizes that security should be the default state in Chrome. Our research has also shown that many users never understood that clicking the lock icon showed important information and controls. We think the new icon helps make permission controls and additional security information more accessible, while avoiding the misunderstandings that plague the lock icon.

      Explanation of why Chrome's lock icon was replaced with the tune icon

    1. To address the issues of CAS, Karpenter uses a different approach. Karpenter directly interacts with the EC2 Fleet API to manage EC2 instances, bypassing the need for autoscaling groups.

      Karpenter

    2. The problem occurs when you want to move the pod to another node, in cases such as cluster rebalancing, spot interruptions, and other events. This is because EBS volumes are zone-bound and can only be attached to EC2 instances within the zone they were originally provisioned in. This is a key limitation that CAS is not able to take into account when provisioning a new node.

      Key limitation of CAS

    3. Since Karpenter can schedule nodes quicker, it will most often win this race and provide a new node for the pending workload. CAS will still attempt to create a new node, however will be slower and will most likely have to remove the node after some time, due to emptiness. This brings unnecessary costs to your cloud bill
    4. It’s worth mentioning that Cluster Autoscaler and Karpenter can co-exist within the same cluster.
    1. I recently chatted with a data science leader who described their company reaching this state. They couldn’t show any business impact from the past two years of their product releases, so the finance team identified a surefire way for R&D to make a business impact: laying off much of the R&D team.

      :D

    1. Who am I speaking to? What do I want? What do they care about? How can I explain it to them in terms they care about?

      Framework for message framing

    1. To recap, I think these are my personal rebase rules I follow:

      Recommendations for doing git rebase (see bullet points below annotation)

    1. Besides communication, there are other soft skills:
       • teamwork
       • learning mindset
       • organization/time management
       • emotional intelligence/empathy
       • approachability
       • persistence/patience
       • confidence

      Core soft skills in IT

    2. You can think of it as the following cycle:
       • software engineer writes code
       • users get new features
       • more users use your products
       • company profits from products
       So code is just a tool to get profit.

      The core software development process

    3. 2) You will rarely get greenfield projects

      :)

    1. However, as we want to perform the bisection automatically using ./calc.py 14 0 as the criterion, we run git bisect run ./calc.py 14 0

      git bisect run ./calc.py 14 0 ← example of running git bisect automatically. The command's exit code tells Git how to classify each commit:
      • If the commit is good, the command should return 0.
      • If the commit is bad, the command should return anything between 1 and 127, inclusive, except 125.
      • If it is not possible to tell whether the commit is good or bad, the commit needs to be skipped, and the command should return 125.
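
The exit-code convention above can be wrapped in a small helper for a custom check script. This is a minimal sketch: bisect_exit_code and its two flags are hypothetical names, and how a real script decides whether the build or the test succeeded is left out.

```python
GOOD = 0    # commit works
BAD = 1     # commit is broken (any value 1-127 except 125 would do)
SKIP = 125  # commit cannot be tested, e.g. it does not build


def bisect_exit_code(build_ok: bool, test_passed: bool) -> int:
    """Map the outcome of a check to the exit code git bisect run expects."""
    if not build_ok:
        return SKIP
    return GOOD if test_passed else BAD


# In a real check script the last line would be something like:
#     sys.exit(bisect_exit_code(build_ok, test_passed))
print(bisect_exit_code(build_ok=True, test_passed=False))
```

Returning 125 for commits that cannot be built keeps unrelated breakage from being blamed as the first bad commit.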

    2. Git Bisect! It allows us to find the commit that broke something. Given a “good” commit (a commit that is not broken, created before the introduction of the bug), and a “bad” commit (a commit that certainly is broken), Git will perform a binary search until the broken commit is found.

      Git Bisect can be run manually or automatically

    3. What are the tools that come to mind when someone says “debug”? Let me guess: a memory leak detector (e.g. Valgrind); a profiler (e.g. GNU gprof); a function that stops your program and gives you a REPL (e.g. Python’s breakpoint and Ruby’s byebug); something that we call a “debugger” (like GDB, or something similar embedded in IDEs); or even our old friend, the print function. So, in this text I’ll try to convince you to add Git to your debug toolbelt.

      6 different debugging tools

    1. The same LM can be a much more or less capable agent depending on the enhancements added. The researchers created and tested four different agents built on top of GPT-4 and Anthropic’s Claude:

      While today’s LM agents don't pose a serious risk, we should be on the lookout for improved autonomous capabilities as LMs get more capable and reliable.

    2. The latest GPT-4 model from OpenAI, which is trained on human preferences using a technique called RLHF. Estimated final training run compute cost: ~$50m. Model version: gpt-4-0613

      ~$50m = estimated training cost of GPT-4

    1. Additionally, students in the Codex group were more eager and excited to continue learning about programming, and felt much less stressed and discouraged during the training.

      Programming with LLM = less stress

    2. On code-authoring tasks, students in the Codex group had a significantly higher correctness score (80%) than the Baseline (44%), and overall finished the tasks significantly faster. However, on the code-modifying tasks, both groups performed similarly in terms of correctness, with the Codex group performing slightly better (66%) than the Baseline (58%).

      In a study, students who learned to code with AI made more progress during training sessions, had significantly higher correctness scores, and retained more of what they learned compared to students who didn't learn with AI.

    1. OpenAI is offering limited access to a text-to-voice generation platform it developed called Voice Engine, which can create a synthetic voice based on a 15-second clip of someone’s voice.

      OpenAI’s voice cloning AI model only needs a 15-second sample to work

  8. Mar 2024
    1. Changing the login URL is a feature we do not include in Wordfence. Though it is something that many people swear by and can help a little in certain situations it’s ultimately not very beneficial. These are the reasons why:

      Brief explanation of why not to change the WordPress login URL

    1. By default, curl uses HTTP/1.1 for the http scheme and HTTP/2 for https. You can change this with flags
    1. Sekurak – 4373 Niebezpiecznik – 4171 Z3S – 3383

      Comparison of the posting frequency of the most popular Polish cybersecurity blogs: 1. Sekurak 2. Niebezpiecznik 3. Z3S

  9. Feb 2024
    1. docker init will scan your project and ask you to confirm and choose the template that best suits your application. Once you select the template, docker init asks you for some project-specific information, automatically generating the necessary Docker resources for your project.

      docker init

    1. The result? Our runtime image just got 6x smaller! Six times! From > 1.1 GB to 170 MB.

      See (above this annotation) the most optimized & CI friendly Python Docker build with Poetry (until this issue gets resolved)

    2. This final trick is not known to many as it’s rather newer compared to the other features I presented. It leverages Buildkit cache mounts, which basically instruct Buildkit to mount and manage a folder for caching reasons. The interesting thing is that such cache will persist across builds! By plugging this feature with Poetry cache (now you understand why I did want to keep caching?) we basically get a dependency cache that is re-used every time we build our project. The result we obtain is a fast dependency build phase when building the same image multiple times on the same environment.

      Combining Buildkit cache and Poetry cache

    1. At a minimum, each ADR should define the context of the decision, the decision itself, and the consequences of the decision for the project and its deliverables

      ADR sections from the example: * Title * Status * Date * Context * Decision * Consequences * Compliance * Notes

    1. We’ve (painstakingly) manually reviewed 310 live MLOps positions, advertised across various platforms in Q4 this year

      They went through 310 role descriptions and, even though role descriptions may vary significantly, they found 3 core skills that a large percentage of MLOps roles required:

      📦 Docker and Kubernetes 🐍 Python 🌥 Cloud

  10. Jan 2024
    1. For 165 euros plus tax (almost a thousand złoty) we get, and now I'm not afraid to write it, a toy. Granted, one packed with technology, but a toy. On top of that, it requires considerable knowledge and programming skills to do anything at all. Unfortunately, the device's “works out of the box” functionality mostly comes down to a universal TV remote.
    1. LocalStack is a cloud service emulator that runs AWS services solely on your laptop without connecting to a remote cloud provider.

      https://www.localstack.cloud/

    1. The most common reasons I hear for hating on JIRA are: - It's too complicated - I spend more time tracking tickets than doing work. To that I say... You hate your micro-Manager
    1. setuptools is the most popular (at 50k packages), Poetry is second at 41k, Hatchling is third at 8.1k. Other tools to cross 500 users include Flit (4.4k), PDM (1.3k), Maturin (1.3k, build backend for Rust-based packages).

      Popularity of Python build backends in 2024

    1. Rick was a very talented developer. Rick could solve complex business logic problems and create sophisticated architectures to support his lofty designs. Rick could not solve the problem of how to work effectively on a team.

      :)

    2. I dove into the source code. Rick was right: no-one could possibly understand what Rick had created. Except for Rick. It was a reflection of the workings of his own mind. Some of it was very clever, a lot of it was copy-pasta, it was all very idiosyncratic, and it was not at all documented.

      I used to work in such a project :)

  11. Dec 2023
    1. “MLX” is more than just a technical solution; it is an innovative and user-friendly framework inspired by popular frameworks like PyTorch, Jax, and ArrayFire. It facilitates the training and deployment of AI models on Apple devices without sacrificing performance or compatibility.

      MLX (high overview)

  12. Nov 2023
    1. RUN poetry install --without dev && rm -rf $POETRY_CACHE_DIR

      The ideal way to run poetry install within a Dockerfile: removing the cache directory in the same layer keeps the image from carrying cache files that would otherwise take up a lot of space (which we could discover with tools like dive)

    1. Rosetta is now Generally Available for all users on macOS 13 or later. It provides faster emulation of Intel-based images on Apple Silicon. To use Rosetta, see Settings. Rosetta is enabled by default on macOS 14.1 and later.

      Tested it on my side, and poetry install of one Python project took 44 seconds instead of 2 minutes 53 seconds, so it's nearly a 4x speed increase!

  13. Oct 2023
    1. PHP would serve WordPress when it's run as a standalone Wasm application.

      php.wasm can essentially run in: 1. Wasm application (runtime) 2. Docker+Wasm container 3. Any app that embeds a Wasm runtime (e.g. Apache HTTPD) 4. Web browser

    2. WebAssembly brings true portability to the picture. You can build a binary once and run it everywhere.
    3. However, on top of the big image size, traditional containers are also bound to the architecture of the platform on which they run.
    4. Wasm container images are much smaller than the traditional ones. Even the alpine version of the php container is bigger than the Wasm one.

      php (166MB), php-alpine (30.1MB), php-wasm (5.35 MB)

    5. With WASI SDK we can build a Wasm module out of PHP's codebase, written in C. After that, it takes a very simple Dockerfile based on scratch for us to make an OCI image that can be run with Docker+Wasm.

      Building a WASM container that can be run with Docker+Wasm

    6. Docker Desktop now includes support for WebAssembly. It is implemented with a containerd shim that can run Wasm applications using a Wasm runtime called WasmEdge. This means that instead of the typical Windows or Linux containers which would run a separate process from a binary in the container image, you can now run a Wasm application in the WasmEdge runtime, mimicking a container. As a result, the container image does not need to contain OS or runtime context for the running application - a single Wasm binary suffices.

      Docker Desktop can run Wasm applications (binaries) instead of OS (Linux/Windows)

    7. We now have WebAssembly. Its technical features and portability make it possible to distribute the application, without requiring shipping OS-level dependencies and can run with strict security constraints.

      Wasm, as a next step in the evolution of server-side software infrastructure

    8. If WASM+WASI existed in 2008, we wouldn't have needed to create Docker. That's how important it is. WebAssembly on the server is the future of computing.

      Quote from one of the co-founders of Docker

    9. There are Wasm runtimes that can run outside of the browser, including traditional operating systems such as Linux, Windows and macOS. Because they cannot rely on a JavaScript engine being available they communicate with the outside world using different interfaces, such as WASI, the WebAssembly System Interface. These runtimes allow Wasm applications to interact with their host system in a similar (but not quite the same) way as POSIX. Projects like WASI SDK and wasi-libc help people compile existing POSIX-compliant applications to WebAssembly.

      Explanation on how Wasm runs on servers

    10. Browser engines integrate a Wasm virtual machine, usually called a Wasm runtime, which can run the Wasm binary instructions. There are compiler toolchains (like Emscripten) that can compile source code to the Wasm target. This allows for legacy applications to be ported to a browser and directly communicate with the JS code that runs in client-side Web applications.

      Explanation on how Wasm runs in browsers

    1. the new Docker+Wasm integration allows you to run a Wasm application alongside your Linux containers at much faster speed.

      ```bash
      time docker run hello-world
      ...
      0.07s user 0.05s system 1% cpu 8.912 total

      time docker run --runtime=io.containerd.wasmedge.v1 --platform=wasi/wasm32 ajeetraina/hello-wasm-docker

      0.05s user 0.03s system 19% cpu 0.393 total
      ```

    2. Docker Desktop and CLI can now manage both Linux containers and Wasm containers side by side.
    1. The results showed that the group asked to reduce their social media use had an average 15% improvement in immune function, including fewer colds, flu, warts, and verrucae, a 50% improvement in sleep quality, and 30% fewer depressive symptoms.
    1. Adolescents who spend more than 3 hours per day on social media may be at heightened risk for mental health problems, particularly internalizing problems.
    1. In September 2023, most of the surveyed professions recorded year-over-year declines in the number of job offers. The largest decline is visible in the IT industry: employers published 52 percent fewer offers year over year.
    1. How to assess durability

      Set of great questions to assess durability of the to-be purchased item

  14. Sep 2023
    1. Mandel’s system was simple — but incredibly complex from a logistical standpoint

      How to win in Lotto every time:

    1. merge queue prevents semantic merge conflicts by automating the rebase process during merge, and ensuring that the trunk branch stays “green.”

      merge queue - new GitHub feature

    1. When I create I learn. When I consume I just relax
    2. We all know the old saying practice makes perfect. The more we use a certain region of our brain, the more our brain "prioritizes" and "hones" it. That is what leads to myelin: activity induces myelination, which leads to increased strength of connectivity and efficiency along those very neurons. It’s a self-reinforcing process.
    3. The fact of the matter is that digital products make it uniquely easy to trick yourself into thinking that you’re learning when you are actually being entertained.
    4. learning must be effortful in order for it to happen

  15. Aug 2023
    1. engineering blogs focus on problems where the solution is a necessary but not sufficient part of what they do. And, ideally, they focus on problems that are complementary to scale that only the publisher of that post has.

      Core reason why companies have their engineering blogs

  16. Jul 2023
    1. cat requirements.txt | grep -E '^[^# ]' | cut -d= -f1 | xargs -n 1 poetry add

      Use poetry init to create a sample pyproject.toml, and then trigger this line to export requirements.txt into a pyproject.toml
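To see what the quoted pipeline actually feeds to poetry add, here is a rough Python equivalent of its grep and cut stages. This is only a sketch: the function name `package_names` is hypothetical, and it deliberately mirrors the shell behaviour, including its limitation with specifiers other than `==`.

```python
def package_names(requirements_text):
    """Mimic `grep -E '^[^# ]' | cut -d= -f1` on the text of a requirements file."""
    names = []
    for line in requirements_text.splitlines():
        # grep -E '^[^# ]' drops empty lines and lines starting with '#' or a space
        if not line or line[0] in "# ":
            continue
        # cut -d= -f1 keeps everything before the first '='
        names.append(line.split("=", 1)[0])
    return names


sample = "# pinned deps\nrequests==2.31.0\n\nflask==3.0.0\nnumpy>=1.26\n"
names = package_names(sample)
print(names)
```

Note that a line such as `numpy>=1.26` comes out as `numpy>`, so the one-liner is only safe for files that pin everything with `==`.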

    1. What happened here is that the file 'somefile.txt' is encoded in UTF-16, but your terminal is (probably) by default set to use UTF-8.  Printing the characters from the UTF-16 encoded text to the UTF-8 encoded terminal doesn't show an apparent problem since the UTF-16 null characters don't get represented on the terminal, but every other odd byte is just a regular ASCII character that looks identical to its UTF-8 encoding.

      The reason why grep Hello sometext.txt may return nothing even when the file contains Hello World!.

      In such a case, use xxd sometext.txt to check the file in hex, and then either: - use grep: grep -aP "H\x00e\x00l\x00l\x00o\x00" * sometext.txt - or convert the file into UTF-8: iconv -f UTF-16 -t UTF-8 sometext.txt > sometext-utf-8.txt
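The byte-level mismatch can be reproduced without any files. The sketch below assumes little-endian UTF-16 without a byte-order mark (a real file may start with a BOM or be big-endian) and shows why a plain ASCII search fails while the NUL-interleaved pattern from the grep command succeeds.

```python
text = "Hello World!"

# Each ASCII character becomes two bytes in UTF-16-LE: the character, then 0x00.
utf16_bytes = text.encode("utf-16-le")

ascii_needle = b"Hello"                      # what a plain `grep Hello` looks for
utf16_needle = "Hello".encode("utf-16-le")   # b"H\x00e\x00l\x00l\x00o\x00"

found_ascii = ascii_needle in utf16_bytes
found_utf16 = utf16_needle in utf16_bytes

# Decoding and re-encoding is the in-memory analogue of the iconv step.
utf8_bytes = utf16_bytes.decode("utf-16-le").encode("utf-8")
found_after_conversion = ascii_needle in utf8_bytes

print(found_ascii, found_utf16, found_after_conversion)
```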

    1. Writing to the database may fail (e.g. it will not respond). When that happens, the process handling outbox pattern will try to resend the event after some time and try to do it until the message is correctly marked as sent in the database.

      Outbox pattern should be especially implemented when using operations such as PostgreSQL LISTEN/NOTIFY

    1. staff are more open to returning to the office if it is out of choice, rather than forced
    2. Unispace finds that nearly half (42%) of companies that mandated office returns witnessed a higher level of employee attrition than they had anticipated. And almost a third (29%) of companies enforcing office returns are struggling with recruitment. Imagine that — nearly half!
    1. python -m calendar

      So surprised that you can output a calendar view using Python

    2. python -m site, which outputs useful information about your installation

      python -m site <--- see useful information about your Python installation

    1. sudo softwareupdate -ia installs all available updates.

      Quite handy macOS command

    1. The results from both Midjourney and Stable Diffusion seem to be the most convincing and realistic if I was to judge from a human point of view and if I didn't know they were AI generated, I would believe their results.

      Midjourney & Stable Diffusion > Dall-E and Adobe Firefly

    1. For a new project, I’d just immediately start with Ruff; for existing projects, I would strongly recommend trying it as soon as you start getting annoyed about how long linting is taking in CI (or even worse, on your computer).

      Recommendation for when to use Ruff over PyLint or Flake8

  17. Jun 2023
    1. A documentation-first culture does not mean everyone is busy writing documents all day. It means that everyone appreciates the value of documenting and sharing their experiences.

      Documentation-first culture

    2. I’m a big fan of documentation. I think it’s my favorite boring thing to do after coding. It brings the business so much long-term value that every hour invested into documentation by anyone saves literally x100 productivity hours across the company.

      High five :)!

    1. All of these values, including the precious contents of the private key file, can be seen via ps when these commands are running. ps finds them via /proc/<pid>/cmdline, which is globally readable for any process ID.

      ps can read some secrets passed via CLI, especially when using --arg with jq.

      Instead, use the --rawfile parameter as noted below this annotation.

    1. Digital nomads must earn at least €2,800 per month to qualify for its new visa, around four times Portugal’s minimum wage. According to Nomad List nearly 16,000 people were remote working in Lisbon last December, where they now find themselves blamed for rocketing rents and house prices.

      Digital nomads in Portugal

    2. According to a March survey, 36 per cent of digital nomads have an annual income of between $100,000 and $250,000. Another eight per cent earn between $250,000 and one million. Attracted by these bank balances, dozens of countries have now introduced so-called “digital nomad visas” (permitting extended stays to work remotely).

      Income of digital nomads

    1. This is the script, which I’ve named docker and put before the real Docker CLI in my PATH

      Script to automatically start Docker if it's not running when we trigger a docker command

    1. There are better parameters to evaluate quality, not quantity, of the time spent staring at your screens

      Questions to ask for validating mobile apps quality

    1. The key to hacking yourself is to increase your awareness of your emotional state. When you become aware that you are angry, the anger is losing the grip it has over you. When you are angry, you are sometimes doing things you would not have done if you were not angry. (Sometimes anger is healthy, it may also be a signal to us that our boundaries have been violated.)

      Hacks around anger

    1. Examples of frontends include: pip, build, poetry, hatch, pdm, flit Examples of backends include: setuptools (>=61), poetry-core, hatchling, pdm-backend, flit-core

      Examples of Python build frontends and backends

    2. pyproject.toml-based builds are the future, and they promote better practices for reliable package builds and installs. You should prefer to use them!

      setup.py is considered a "legacy" functionality these days

    3. Did you say setuptools? Yes! You may be familiar with setuptools as the thing that used your setup.py files to build packages. Setuptools now also fully supports pyproject.toml-based builds since version 61.0.0. You can do everything in pyproject.toml and no longer need setup.py or setup.cfg.

      setuptools can now utilize pyproject.toml

    1. // save to tar file
       docker save nodeversion > nodeversion.tar
       // load from tar file
       docker load < nodeversion.tar

      Saving and loading Docker images locally

    1. Developers often speak of "getting into the flow" or "being in the zone." Such statements colloquially describe the concept of flow state, a mental state in which a person performing an activity is fully immersed in a feeling of energized focus, full involvement, and enjoyment.

      One of my favourite explanations of the flow state

  18. May 2023
    1. With this dataclass, I have an explicit description of what the function returns.

      Dataclasses give you a lot more clarity of what the function returns, in comparison to returning tuples or dictionaries
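A small sketch of the idea, using a hypothetical `summarize` function and `Summary` type (neither comes from the annotated article). The point is that callers read named fields instead of remembering tuple positions or dictionary keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Explicit description of what summarize() returns."""
    minimum: float
    maximum: float
    mean: float


def summarize(values):
    """Return basic statistics as a named, typed structure instead of a tuple."""
    return Summary(
        minimum=min(values),
        maximum=max(values),
        mean=sum(values) / len(values),
    )


summary = summarize([2.0, 4.0, 9.0])
print(summary.minimum, summary.maximum, summary.mean)
```

With a tuple, the same call site would have to rely on `result[2]` meaning "mean"; here a typo in a field name fails loudly with an AttributeError.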

    1. Today is 9th Feb. The oldest segment – segment 100 – still can’t be deleted by the 7-day topic retention policy, because the most recent message in that segment is only 5 days old. The result is that they can see messages dating back to 28th January on this topic, even though the 28th Jan is now 12 days ago. In a couple of days, all the messages in segment 100 will be old enough to exceed the retention threshold so that segment file will be deleted.

      retention.ms set to 7 days doesn't guarantee that you will only see topic messages from the last 7 days. Think of it as a threshold that the Kafka broker can use to decide when messages are eligible for being automatically deleted.
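The quoted scenario can be modelled in a few lines. This is a simplified sketch, not broker code: the function name is hypothetical, the year is arbitrary (the quote only gives day and month), and a real broker also considers things like the active segment and how often the retention check runs.

```python
from datetime import datetime, timedelta


def segment_is_deletable(newest_message_time, now, retention):
    """A segment becomes eligible only once its NEWEST message exceeds retention."""
    return now - newest_message_time > retention


retention = timedelta(days=7)            # retention.ms expressed as 7 days
today = datetime(2023, 2, 9)             # "Today is 9th Feb" (year is arbitrary)
oldest_in_segment = datetime(2023, 1, 28)
newest_in_segment = today - timedelta(days=5)

# The oldest message is 12 days old, yet the segment cannot be deleted yet.
age_of_oldest = (today - oldest_in_segment).days
deletable_today = segment_is_deletable(newest_in_segment, today, retention)

# Three days later the newest message is 8 days old, so the whole segment qualifies.
deletable_later = segment_is_deletable(
    newest_in_segment, today + timedelta(days=3), retention
)

print(age_of_oldest, deletable_today, deletable_later)
```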

    1. Host machine: docker run -it -p 8888:8888 image:version. Inside the container: jupyter notebook --ip 0.0.0.0 --no-browser --allow-root. On the host machine, open this URL: localhost:8888/tree

      3 steps for running Jupyter Notebook in a container

  19. Apr 2023
    1. Armed with all this knowledge, we realise that we can construct an almost unlimited number of different path strings that all refer to the same directory

      See below the number of ways to define the same path on Windows

    2. UNC stands for Universal Naming Convention and describes paths that start with \\, commonly used to refer to network drives. The first segment after the \\ is the host, which can be either a named server or an IP address

      UNC paths on Windows
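Python's pathlib can take a UNC path apart on any operating system, which makes the host/share structure easy to inspect. The server and share names below are made up for illustration.

```python
from pathlib import PureWindowsPath

# Hypothetical UNC path: host "fileserver", share "projects"
path = PureWindowsPath(r"\\fileserver\projects\reports\q1.txt")

# For UNC paths, pathlib treats \\host\share as the "drive"
drive = path.drive
host, share = drive.lstrip("\\").split("\\")

print(drive)
print(host, share)
print(path.name)
```

PureWindowsPath only manipulates the string form, so nothing here touches the network or requires Windows.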

    3. On any Unix-derived system, a path is an admirably simple thing: if it starts with a /, it’s a path. Not so on Windows

      Paths on Windows are much more complex

    1. Em dash Works better than commas to set apart a unique idea from the main clause of a sentence
    2. The hyphen does not indicate a range of numbers, like a date range, which is the job of an en dash
    1. As I see it, there are two hard things about getting into flow: loading the state of the system / problem / abstractions into your head (i.e. filling your L1 and L2 cache with everything you need to know to work on the problem) and building momentum and confidence for yourself.

      Hard things for entering the flow

    1. To return to information overload: this means treating your "to read" pile like a river (a stream that flows past you, and from which you pluck a few choice items, here and there) instead of a bucket (which demands that you empty it). After all, you presumably don't feel overwhelmed by all the unread books in the British Library – and not because there aren't an overwhelming number of them, but because it never occurred to you that it might be your job to get through them all.

      Lesson on how to treat one's to-read list

    1. If you suddenly transform the social lives of girls, putting them onto platforms that prioritize social comparison and performance, platforms where we know that heavy users are three times more likely to be depressed than light users, might that have some impact on the mental health of girls around the world? We think so
    1. A few examples that use Conventional Commits:

      Examples of Conventional Commits (see the block below)

    2. Conventional Commits – type

      Conventional Commits types: - feat - fix - docs - chore - refactor - tests - perf - styles - ci - build - revert
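As a sketch of how such a type list could be enforced, the snippet below checks a commit header against a `type(scope): description` shape. The type names are copied from the note above as-is (many projects use `test` and `style` instead of `tests` and `styles`), and the function name and regular expression are assumptions for illustration, not an official validator.

```python
import re

# Types as listed in the note; adjust to the convention your project follows.
TYPES = ("feat", "fix", "docs", "chore", "refactor", "tests",
         "perf", "styles", "ci", "build", "revert")

HEADER = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(\((?P<scope>[^()]+)\))?"      # optional scope in parentheses
    r"(?P<breaking>!)?"              # optional breaking-change marker
    r": (?P<description>.+)$"
)


def parse_commit_header(header):
    """Return the parsed parts of a commit header, or None if it does not match."""
    match = HEADER.match(header)
    return match.groupdict() if match else None


good = parse_commit_header("feat(api): add login endpoint")
bad = parse_commit_header("cr fixes")
print(good)
print(bad)
```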

    3. Commits with messages like refactor, added XXX, or cr fixes are a sad and boring reality.
    1. If you install a package with pip’s --user option, all its files will be installed in the .local directory of the current user’s home directory.

      One of the recommendations for Python multi-stage Docker builds. Thanks to pip install --user, the packages won't be spread across 3 different paths.

  20. Mar 2023
    1. rg . | fzf: Fuzzy search every line in every file

      Shortcut for searching files with ripgrep and fzf

    1. Spend your career hanging out with people you like working with, doing work you enjoy, trying new experiences, and having fewer regrets. That’s how you retire one day & make it to your deathbed a happy human.
    2. You don’t know what day it is. Monday feels like Saturday night fever.

      The #1 sign of a successful career

    1. Honestly, all the activation scripts do are:

      See the 4 steps below to understand what activating an environment in Python really does

    1. Using pex in combination with S3 for storing the pex files, we built a system where the fast path avoids the overhead of building and launching Docker images. Our system works like this: when you commit code to GitHub, the GitHub action either does a full build or a fast build depending on if your dependencies have changed since the previous deploy. We keep track of the set of dependencies specified in setup.py and requirements.txt. For a full build, we build your project dependencies into a deps.pex file and your code into a source.pex file. Both are uploaded to Dagster cloud. For a fast build we only build and upload the source.pex file. In Dagster Cloud, we may reuse an existing container or provision a new container as the code server. We download the deps.pex and source.pex files onto this code server and use them to run your code in an isolated environment.

      Fast vs full deployments

    1. "We could use this type of DB, or this other, or that other, and these are some pros and cons… And based on all these tradeoffs, I’ll use THAT type of DB."

      Example of how to recommend a single system

    2. The difference between coding and system design is the difference between retrieving and creating. Instead of finding (or “retrieving”) a solution, you are creating a solution. In this way, coding is akin to a science, while system design is more like an art.
    1. there’s the famous 2019 paper by Allcott et al. which found that having people deactivate Facebook for a while made them happier, while also making them socialize more and worry less about politics
    2. [There’s also] a big new study from Cambridge University, in which researchers looked at 84,000 people…and found that social media was strongly associated with worse mental health during certain sensitive life periods, including for girls ages 11 to 13…One explanation is that teenagers (and teenage girls in particular) are uniquely sensitive to the judgment of friends, teachers, and the digital crowd.
    3. First, they’re a distraction — the rise of smartphones was also the rise of “phubbing”, i.e. when people go on their phones instead of paying attention to the people around them.

      phubbing

    4. Text is a highly attenuated medium — it’s slow and cumbersome, and an ocean of nuance and tone and emotion is lost. Even video chat is a highly incomplete substitute for physical interaction. A phone doesn’t allow you to experience the nearby physical presence of another living, breathing body — something that we spent untold eons evolving to be accustomed to. And of course that’s even before mentioning activities like sex that are far better when physical contact is involved.

      It turns out that text/video chatting is not really a replacement for in-person social interaction; relying on it can lead to social isolation

    5. Yglesias argues that the progressive politics of the 2010s encouraged progressives to think of everything in catastrophic terms, making them less happy.
    1. You can freely replace SageMaker services with other components as your project grows and potentially outgrows SageMaker.

    1. This could be because the size can be misleading, there is on disk size, push/pull payload size, and sum of ungzipped tars. The size of the ungzipped tars is often used to represent the size of the image in Docker but the actual size on disk is dependent on the graph driver. From the registry perspective, the sum of the gzipped layers is most important because it represent what the registry is storing and what needs to be transferred.

      Docker image size on a local drive will be different

    1. Ultimately, after researching how we can overcome some inconveniences in Kubeflow, we decided to continue using it. Even though the UI could use some improvements in terms of clarity, we didn’t want to give up the advantages of configured CI/CD and containerization, which allowed us to use different environments. Also, for our projects, it is convenient to develop each ML pipeline in separate Git repositories.

      Kubeflow sounds like the most feature rich solution, whose main con is its UI and the setup process

    2. So, let’s sum up the pros and cons of each tool:

      Summary of pros/cons of Airflow, Kubeflow and Prefect

    3. The airflow environment must have all the libraries that are being imported in all DAGs. Without using containerization all Airflow pipelines are launched within the same environment. This leads to limitations in using exotic libraries or conflicting module versions for different projects.

      Main con of Airflow

    4. Prefect is a comparatively new but promising orchestration tool that appeared in 2018. The tool positions itself as a replacement for Airflow, featuring greater flexibility and simplicity. It is an open-source project; however, there is a paid cloud version to track workflows.
    5. Airflow has been one of the most popular orchestrating tools for several years.

      (see the graph above)

    6. An orchestration tool usually doesn’t do the hard work of translating and processing data itself, but tells other systems and frameworks what to do and monitors the status of the execution.

      Responsibility of the orchestration tool

    7. To this day, the field of machine learning does not have a single generally accepted approach to solving problems in terms of practical use of models.

      Business ¯\_(ツ)_/¯

    1. Well, in short, with iterators, the flow of information is one-way only. When you have an iterator, all you can really do is call the __next__ method to get the very next value to be yielded. In contrast, the flow of information with generators is bidirectional: you can send information back into the generator via the send method.
      • Iterator ← one-way communication (can only yield stuff)
      • Generator ← bidirectional communication (can also accept information via the send method)
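The two bullets above can be seen side by side in a short sketch. The `running_average` generator is a made-up example: each value passed to send() flows into the generator, and the updated average flows back out.

```python
def running_average():
    """Receive numbers via send() and yield the running mean after each one."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Yield the current average, then pause until the caller sends a value.
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)                 # prime the generator: run up to the first yield
first = averager.send(10)      # information goes in, the new average comes out
second = averager.send(20)

# A plain list iterator only supports next(); it has no send() method.
plain_iterator = iter([1, 2, 3])
can_send = hasattr(plain_iterator, "send")

print(first, second, can_send)
```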
    1. So why aren't more people using Nim? I don't know! It's the closest thing to a perfect language that I've used by far.

      Nim sounds like the ideal language compared to Python, Rust, Julia, C#, Swift, and C

    1. Time to dive a little deeper to see what information the barcodes actually contain. For this I will break down the previously extracted information into smaller pieces.

      Information contained within boarding pass barcodes

    1. "For this campaign, we surveyed 930 Americans to explore their retirement plans. Among them, 16% were retired, 22% were still working, and 62% were retirees who had returned to work."So, 149 of those surveyed were retired. Of those 149, 25 (1 in 6) are considering returning to work. 13 of those want remote positions.
    1. Watching a TED talk gets you high for an hour. Watching an inspiring movie gets you thinking for a day. Reading a book keeps us motivated for about a week.
    1. Instagram was founded in 2010. The iPhone 4 was released then too—the first smartphone with a front-facing camera. In 2012 Facebook bought Instagram, and that’s the year that its user base exploded. By 2015, it was becoming normal for 12-year-old girls to spend hours each day taking selfies

      Main cause of global depression

    1. ServingRuntime - Templates for Pods that can serve one or more particular model formats. There are three "built in" runtimes that cover the out-of-the-box model types, custom runtimes can be defined by creating additional ones.

      ServingRuntime

    1. cluster with 4096 IP addresses can deploy at most 1024 models assuming each InferenceService has 4 pods on average (two transformer replicas and two predictor replicas).

      Kubernetes clusters have a maximum IP address limitation

    2. According to Kubernetes best practice, a node shouldn't run more than 100 pods.
    3. Each model’s resource overhead is 1CPU and 1 GB memory. Deploying many models using the current approach will quickly use up a cluster's computing resource. With Multi-model serving, these models can be loaded in one InferenceService, then each model's average overhead is 0.1 CPU and 0.1GB memory.

      If I am not mistaken, the multi-model approach reduces the per-model resource overhead by 90% in this case
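A quick check of the arithmetic, using only the figures quoted in this group of highlights (4096 IP addresses, 4 pods per InferenceService, and per-model overhead of 1 CPU / 1 GB versus 0.1 CPU / 0.1 GB). The variable names are just for illustration.

```python
# IP address limit: each InferenceService uses 4 pods on average
cluster_ips = 4096
pods_per_inference_service = 4
max_models = cluster_ips // pods_per_inference_service

# Per-model overhead, one model per InferenceService vs. multi-model serving
cpu_single, cpu_multi = 1.0, 0.1
mem_single_gb, mem_multi_gb = 1.0, 0.1

cpu_reduction = 1 - cpu_multi / cpu_single
mem_reduction = 1 - mem_multi_gb / mem_single_gb

print(max_models)
print(f"{cpu_reduction:.0%} less CPU, {mem_reduction:.0%} less memory per model")
```

Both reductions come out at 90%, which matches the note; the saving applies to the overhead per model, not to the total cluster size.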

    4. Multi-model serving is designed to address three types of limitations KServe will run into

      Benefits of multi-model serving

    5. While you get the benefit of better inference accuracy and data privacy by building models for each use case, it is more challenging to deploy thousands to hundreds of thousands of models on a Kubernetes cluster.

      With more separation, comes the problem of distribution

    1. I could be super-happy with any of Fira Code Retina, Hack, JetBrains Mono, or Inconsolata.

      Recommended fonts for 14" MacBook

    1. depending on how smart the framework is, you might find yourself installing Conda packages over and over again on every run. This is inefficient, even when using a faster installer like Mamba.
    2. there’s the bootstrapping problem: depending on the framework you’re using, you might need to install Conda and the framework driver before you can get anything going. A Docker image would come prepackaged with both, in addition to your code and its dependencies. So even if your framework supports Conda directly, you might want to use Docker anyway.
    3. Mlflow supports both Conda and Docker-based projects.
    4. The only thing that will depend on the host operating system is glibc, pretty much everything else will be packaged by Conda. So a pinned environment.yml or conda-lock.yml file is a reasonable alternative to a Docker image as far as having consistent dependencies.

      Conda can be a sufficient alternative to Docker

    5. To summarize, for the use case of Python development environments, here’s how you might choose alternatives to Docker:

      (see table below)

    6. Conda packages everything but the standard C library, from C libraries to the Python interpreter to command-line tools to compilers.
    1. So the short answer is to pick rebase or merge based on what you want your history to look like.

      Quick summary of rebase vs merge

    1. In summary, motivation is trying to feel like doing stuff. Discipline is doing it even if you don’t feel like it.

      Motivation vs Discipline

    1. response times, error rates, and request rates

      Sample metrics to monitor
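
A minimal sketch of how the three metrics above could be derived from raw request records. The `RequestRecord` type, the `summarize` helper, and the choice to count only 5xx responses as errors are assumptions made for illustration; they do not come from the annotated article.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    """One handled request (hypothetical record shape)."""
    duration_ms: float
    status_code: int


def summarize(records, window_seconds):
    """Return request rate, error rate and mean response time for one window."""
    if not records or window_seconds <= 0:
        # Nothing to measure: report zeros instead of dividing by zero.
        return {"request_rate_per_s": 0.0, "error_rate": 0.0, "avg_response_ms": 0.0}
    total = len(records)
    # Assumption: only server-side failures (5xx) count as errors.
    server_errors = sum(1 for r in records if r.status_code >= 500)
    return {
        "request_rate_per_s": total / window_seconds,
        "error_rate": server_errors / total,
        "avg_response_ms": mean(r.duration_ms for r in records),
    }
```

In practice these numbers would usually come from a metrics library or an API gateway rather than hand-rolled code; the sketch only shows what each metric measures.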

    2. You can use authentication mechanisms such as OAuth2, JSON Web Tokens (JWT), or HTTP Basic Authentication to ensure that only authorized users or applications can access your API.
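
As a rough illustration of the simplest of these mechanisms, the sketch below validates an HTTP Basic `Authorization` header using only the standard library. The helper names `parse_basic_auth` and `is_authorized` are hypothetical, and a real service would compare against hashed credentials from a store rather than plain strings.

```python
import base64
import hmac


def parse_basic_auth(header):
    """Return (username, password) from a Basic auth header, or None if malformed."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # binascii.Error (invalid base64) is a subclass of ValueError.
        return None
    username, sep, password = decoded.partition(":")
    if not sep:
        return None
    return username, password


def is_authorized(header, expected_user, expected_password):
    """Check the header against expected credentials in constant time."""
    creds = parse_basic_auth(header)
    if creds is None:
        return False
    user_ok = hmac.compare_digest(creds[0].encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(creds[1].encode(), expected_password.encode())
    return user_ok and pass_ok
```

Basic authentication sends credentials with every request, so it is only reasonable over HTTPS; OAuth2 or JWT are the usual choices when tokens need scopes or expiry.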
    3. In this example, we’ve defined an API endpoint called /predict_image that accepts a file upload using FastAPI's UploadFile type. When a client sends an image file to this endpoint, the file is read and its contents are passed to a preprocessing function that prepares the image for input into the model. Once the image has been preprocessed, the model can make a prediction on it, and the result can be returned to the client as a JSON response.

      Example above shows how to upload an image to an API endpoint with FastAPI.

      Example below is a bit more complex.

    4. For example, if you are using TensorFlow, you might save your model as a .h5 file using the Keras API. If you are using PyTorch, you might save your model as a .pt file using the torch.save() function. By saving your model as a file, you can easily load it into a deployment environment (such as FastAPI) and use it to make predictions on new images
  21. Feb 2023
    1. Regular Shell Commands

      Some of my favourite aliases: 1 (already configured in my ohmyzsh), 4, 6 (already configured in my ohmyzsh), 13, 17.

    2. The set -x command is used to turn on debugging in a shell script and can also be used to test bash aliases. When set -x is used, the command and its arguments are printed to the standard error stream before the command is executed. This can be useful for testing aliases because it lets you see exactly what command is running and with what arguments.

      set -x

    3. 6. A function that checks if a website is up or down
    4. 5. A function that allows using sudo command without having to type a password every time
    5. Kubernetes Aliases

      Some of my favourite k8s aliases: 2, 3.

    6. Mac User Aliases

      Some of my favourite Mac aliases: 1, 11.

    7. A much more elegant approach, however, is to add them to an ~/.aliases-like file and then source this file in your respective profile file as source ~/.aliases

      More elegant way to list aliases

    1. The way to get new ideas is to notice anomalies: what seems strange, or missing, or broken? You can see anomalies in everyday life (much of standup comedy is based on this), but the best place to look for them is at the frontiers of knowledge. Knowledge grows fractally. From a distance its edges look smooth, but when you learn enough to get close to one, you'll notice it's full of gaps. These gaps will seem obvious; it will seem inexplicable that no one has tried x or wondered about y. In the best case, exploring such gaps yields whole new fractal buds.

      Way to get new ideas

    1. A huge percentage of the data that gets processed is less than 24 hours old. By the time data gets to be a week old, it is probably 20 times less likely to be queried than from the most recent day. After a month, data mostly just sits there.