2,008 Matching Annotations
  1. May 2022
    1. As of today, Docker Engine should be understood as the open-source software for Linux, while Docker Desktop should be understood as Docker, Inc.'s freemium product for the Mac and Windows platforms. From Docker's product page: "Docker Desktop includes Docker Engine, Docker CLI client, Docker Build/BuildKit, Docker Compose, Docker Content Trust, Kubernetes, Docker Scan, and Credential Helper".

      About Docker Engine and Docker Desktop

    2. The diagram below tries to summarise the situation as of today, and most importantly to clarify the relationships between the various moving parts.

      Containers (the backend):

    1. If you want your voice to be heard (and also improve the usability of your text) you have to design your document for “skim-ability”. You do this by providing anchor points that allow the user to gauge the content without actually reading it. You want the outline and key arguments to keep standing out in the final version of the document.

      This is why I like using bullet points

    2. When writing an article, I generally visualize a concrete person, representative of the audience, that I am directing the text towards.

      I also tend to write to my old self

    3. Writing is generally a great way to learn, but one has to realize that you are doing it. Learning is a slow process and requires patience. It is not helped much by agonizing in front of a screen, trying to squeeze out another sentence. Doing more research on the topic by reading a book, blog or paper and taking notes may be a better time investment.

      Writing is a great learning method

    4. The realization that you don’t have the complete message in your head will often only become apparent while writing. This surfaces as an inability to find a good punch-line or to express yourself clearly. In fact, writing is a great test to see if you have a good understanding of a topic, and have a firm grasp on the vocabulary of the domain.
    1. an asshole test. You see who turns into an asshole under pressure and they don't make it to the next round".

      Asshole test: 60 minutes, 8 people, no right answer, but checking who turns into an asshole under pressure

    1. A 20-year age difference (for example, from 20 to 40, or from 30 to 50 years old) will, on average, correspond to reading 30 WPM slower, meaning that a 50-year old user will need about 11% more time than a 30-year old user to read the same text.
    2. Users’ age had a strong impact on their reading speed, which dropped by 1.5 WPM for each year of age.
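
      The two figures above are consistent with each other; a quick back-of-the-envelope check (the baseline speeds below are derived assumptions, not numbers from the study):

      ```python
      # 1.5 WPM lost per year of age, over a 20-year gap:
      drop_per_year = 1.5
      years = 20
      delta = drop_per_year * years
      print(delta)  # 30.0 WPM, matching the quoted figure

      # If the older reader needs ~11% more time, their speed v satisfies
      # (v + 30) / v = 1.11, so v = 30 / 0.11 (hypothetical baseline).
      v_older = delta / 0.11
      v_younger = v_older + delta
      extra_time = v_younger / v_older - 1
      print(round(v_older), round(extra_time * 100))  # ~273 WPM, ~11%
      ```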
    1. Stop using TODO for everything in your comments that requires you to do something later.

      Possible alternatives to TODO:
      * FIXME - something is broken
      * HACK/OPTIMIZE - the code is suboptimal and should be refactored
      * BUG - there is a bug in the code
      * CHECKME/REVIEW - the code needs to be reviewed
      * DOCME - the code needs to be documented (either in codebase or external documentation)
      * TESTME - the specified code needs to be tested or tests need to be written for that selection of code
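
      A small sketch of how such tags can be made actionable: a scanner that collects tagged comments from source text (the sample code and notes are hypothetical):

      ```python
      import re

      # The comment tags from the list above.
      TAGS = ("FIXME", "HACK", "OPTIMIZE", "BUG", "CHECKME", "REVIEW", "DOCME", "TESTME")

      def find_tags(source: str):
          """Return (line_number, tag, note) for each tagged comment."""
          pattern = re.compile(r"#\s*(%s)\b[:\s]*(.*)" % "|".join(TAGS))
          hits = []
          for lineno, line in enumerate(source.splitlines(), start=1):
              m = pattern.search(line)
              if m:
                  hits.append((lineno, m.group(1), m.group(2).strip()))
          return hits

      code = '''
      total = compute()  # FIXME: crashes when the list is empty
      cache = {}  # OPTIMIZE: rebuildable, but recomputed on every call
      '''
      print(find_tags(code))
      ```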

    1. 1. Create the new empty table
       2. Write to both old and new table
       3. Copy data (in chunks) from old to new
       4. Validate consistency
       5. Switch reads to new table
       6. Stop writes to the old table
       7. Cleanup old table

      7 steps required while migrating to a new table
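
      The steps above can be sketched end-to-end with sqlite3 (a toy illustration; the `users` table and column names are hypothetical, and a real migration adds locking, backfill jobs, and monitoring):

      ```python
      import sqlite3

      db = sqlite3.connect(":memory:")
      db.execute("CREATE TABLE users_old (id INTEGER PRIMARY KEY, name TEXT)")
      db.executemany("INSERT INTO users_old VALUES (?, ?)",
                     [(1, "ada"), (2, "linus"), (3, "grace")])

      # 1. Create the new, empty table (here: with a changed schema).
      db.execute("CREATE TABLE users_new "
                 "(id INTEGER PRIMARY KEY, name TEXT, active INTEGER DEFAULT 1)")

      # 2. Dual-write: new traffic goes to both tables.
      def insert_user(uid, name):
          db.execute("INSERT INTO users_old VALUES (?, ?)", (uid, name))
          db.execute("INSERT INTO users_new (id, name) VALUES (?, ?)", (uid, name))

      insert_user(4, "edsger")

      # 3. Copy existing data in chunks from old to new.
      chunk, last_id = 2, 0
      while True:
          rows = db.execute(
              "SELECT id, name FROM users_old WHERE id > ? ORDER BY id LIMIT ?",
              (last_id, chunk)).fetchall()
          if not rows:
              break
          db.executemany(
              "INSERT OR IGNORE INTO users_new (id, name) VALUES (?, ?)", rows)
          last_id = rows[-1][0]

      # 4. Validate consistency before switching reads.
      old_count = db.execute("SELECT COUNT(*) FROM users_old").fetchone()[0]
      new_count = db.execute("SELECT COUNT(*) FROM users_new").fetchone()[0]
      assert old_count == new_count

      # 5-7. Switch reads to users_new, stop writing to users_old, clean up.
      db.execute("DROP TABLE users_old")
      print(new_count)  # 4
      ```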

    1. Without accounting for what we install or add inside, the base python:3.8.6-buster weighs 882MB vs 113MB for the slim version. Of course it's at the expense of many tools such as build toolchains, but you probably don't need them in your production image. Your ops teams should be happier with these lighter images: less attack surface, less code that can break, less transfer time, less disk space used, ... And our Dockerfile is still readable so it should be easy to maintain.

      See sample Dockerfile above this annotation (below there is a version tweaked even further)

    2. scratch is a special empty image with no operating system.
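
      A typical use is a multi-stage build that copies a statically linked binary into scratch (a sketch; the Go paths and build flags are illustrative):

      ```dockerfile
      # Build stage: compile a static binary.
      FROM golang:1.18 AS build
      WORKDIR /src
      COPY . .
      RUN CGO_ENABLED=0 go build -o /app ./...

      # Final stage: nothing but the binary - no shell, no OS.
      FROM scratch
      COPY --from=build /app /app
      ENTRYPOINT ["/app"]
      ```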
    1. Overall, having spent a significant amount of time building this project, scaling it up to the size it’s at now, as well as analysing the data, the main conclusion is that it is not worth building your own solution, and investing this much time. When I first started building this project 3 years ago, I expected to learn way more surprising and interesting facts. There were some, and it’s super interesting to look through those graphs, however retrospectively, it did not justify the hundreds of hours I invested in this project. I’ll likely continue tracking my mood, as well as a few other key metrics, however will significantly reduce the amount of time I invest in it.

      Words of the author of https://krausefx.com//blog/how-i-put-my-whole-life-into-a-single-database

      It seems as if excessive personal data tracking is not worth it

    1. 45% less time spent in video & audio calls that day

      I can relate with the author. Spending time in a high number of video/audio calls can drastically decrease my mood

    1. A properly formed Git commit subject line should always be able to complete the following sentence: "If applied, this commit will *your subject line here*". For example: "If applied, this commit will refactor subsystem X for readability".

      An example how to always aim for imperative commits

    2. Git itself uses the imperative whenever it creates a commit on your behalf. For example, the default message created when using git merge reads: "Merge branch 'myfeature'"

      Using imperative mood in subject line of git commits

    3. Commit messages with bodies are not so easy to write with the -m option. You’re better off writing the message in a proper text editor.

      I've tested it on Windows, and in PowerShell or Git Bash it is as simple as:

      ```console
      git commit -m "Subject line<ENTER>

      body line 1
      body line 2"<ENTER>
      ```

      However, it does not work in CMD.exe (pressing [ENTER] will not move to the next line)
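
      A portable alternative that also works in CMD.exe is passing `-m` twice: git joins the messages as separate paragraphs, so the first becomes the subject and the second the body (demonstrated in a throwaway repo):

      ```shell
      dir=$(mktemp -d) && cd "$dir"
      git init -q
      git config user.email you@example.com && git config user.name you
      echo change > file.txt && git add file.txt

      # Each -m becomes its own paragraph: subject first, then body.
      git commit -q -m "Refactor subsystem X for readability" \
                 -m "Long call chains made the module hard to follow; extract helpers."

      git log -1 --format=%B
      ```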

    4. Firstly, not every commit requires both a subject and a body. Sometimes a single line is fine, especially when the change is so simple that no further context is necessary.

      Not every commit requires a body part

    5. Summarize changes in around 50 characters or less

       More detailed explanatory text, if necessary. Wrap it to about 72
       characters or so. In some contexts, the first line is treated as the
       subject of the commit and the rest of the text as the body. The blank
       line separating the summary from the body is critical (unless you omit
       the body entirely); various tools like `log`, `shortlog` and `rebase`
       can get confused if you run the two together.

       Explain the problem that this commit is solving. Focus on why you are
       making this change as opposed to how (the code explains that). Are
       there side effects or other unintuitive consequences of this change?
       Here's the place to explain them.

       Further paragraphs come after blank lines.

       - Bullet points are okay, too

       - Typically a hyphen or asterisk is used for the bullet, preceded by a
         single space, with blank lines in between, but conventions vary here

       If you use an issue tracker, put references to them at the bottom,
       like this:

       Resolves: #123
       See also: #456, #789

      Example of a great commit message

    6. 1. Separate subject from body with a blank line
       2. Limit the subject line to 50 characters
       3. Capitalize the subject line
       4. Do not end the subject line with a period
       5. Use the imperative mood in the subject line
       6. Wrap the body at 72 characters
       7. Use the body to explain what and why vs. how

      7 rules of great commit messages

    7. What may be a hassle at first soon becomes habit, and eventually a source of pride and productivity for all involved.
    8. Commit messages can do exactly that and as a result, a commit message shows whether a developer is a good collaborator.
    9. A diff will tell you what changed, but only the commit message can properly tell you why.

      Commit messages are important

    10. Look at Spring Boot, or any repository managed by Tim Pope.
  2. Apr 2022
    1. To access your Linux files in Windows, open the Ubuntu terminal and type explorer.exe . (include the punctuation mark). This will open the Linux directory in Windows Explorer, with the WSL prefix “\\wsl$\Ubuntu-18.04\home\your-username”. Now, you’ll notice that Windows treats your Linux environment as a second network.
      • Accessing WSL files from Windows in the WSL terminal: explorer.exe .
      • Accessing Windows files from WSL terminal: cd /mnt
    1. The eagle-eyed among you may notice that there isn't a .git directory in the app-example-2 working tree. Instead, there's a .git file. This file points to the git directory in the original clone, and it means that all your usual git commands work inside the app-example-2 directory, as well as in the original app-example directory.

      The working of git worktree

    2. Anecdotally, I've found IDEs get much less "confused" if you use their built-in support for switching git branches, instead of changing them from the command line and waiting for the IDE to "notice" the changes.
    1. Reading puts candidates at ease compared to writing code.  As an interviewer, stress is your enemy because it raises adrenaline which lowers IQ by several points, causing you to miss good candidates.   Candidates prefer reading partly because they are relieved to not have to write code, but also because the interviewer can easily adjust the reading questions to accommodate for the candidate’s skill.
    1. Finally, to make our terminal really pretty, we need to customize the prompt. There's lots of options out there for this, but the most popular one seems to be ohmyzsh for Bash and oh-my-posh for PowerShell. I'm not a huge fan of these because in my experience they slow down the terminal to a point which makes me frustrated to use them, and since they are separate solutions for each environment they must be configured separately.

      I agree. After using oh-my-posh in almost every Windows console, I have finally decided to make a switch to Starship

    1. We asked them to collect searches they performed in their normal workday, and then evaluate their performance on Google, Bing, Neeva, You.com, Kagi, and DuckDuckGo. Here are more examples of programming queries where Google is lagging behind.

      For example, searching for "python throw exception" will show better results in Neeva or You.com than in Google

    1. In the previous version, using the standard library, once the data is loaded we no longer need to keep the file open. With this API the file has to stay open because the JSON parser is reading from the file on demand, as we iterate over the records.

      For ijson.items(), the peak tracked memory usage was 3.6 MiB for a large JSON, instead of 124.7 MiB as for the standard json.load()

    2. One common solution is streaming parsing, aka lazy parsing, iterative parsing, or chunked processing.

      Solution for processing large JSON files in Python
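
      The same idea can be shown with the standard library alone when the data is newline-delimited JSON: a generator parses one record at a time, so memory stays flat regardless of file size (ijson applies the same idea to a single large JSON document). The file name and record shape below are illustrative:

      ```python
      import json
      import os
      import tempfile
      from typing import Iterator

      def iter_records(path: str) -> Iterator[dict]:
          """Lazily yield one JSON object per line (JSON Lines format)."""
          with open(path) as f:
              for line in f:
                  if line.strip():
                      yield json.loads(line)

      # Write a small sample file, then stream it record by record.
      path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
      with open(path, "w") as f:
          for i in range(3):
              f.write(json.dumps({"id": i, "kind": "click"}) + "\n")

      total = sum(1 for record in iter_records(path))
      print(total)  # 3
      ```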

    3. Then, if the string can be represented as ASCII, only one byte of memory is used per character. If the string uses more extended characters, it might end up using as many as 4 bytes per character. We can see how much memory an object needs using sys.getsizeof()

      "a" takes less bytes than "❄", which takes less bytes than "💵"
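
      A quick check of that claim (exact byte counts vary by CPython version, so only the ordering is asserted here):

      ```python
      import sys

      # CPython stores strings in the narrowest encoding that fits:
      # 1 byte/char for ASCII, 2 for other BMP characters, 4 for e.g. emoji.
      for s in ("a", "❄", "💵"):
          print(repr(s), sys.getsizeof(s))

      assert sys.getsizeof("a") < sys.getsizeof("❄") < sys.getsizeof("💵")
      ```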

    1. I tried PuTTY, which I’d heard was the good SSH thing on Windows, but it’s…​ not good, at all. PowerShell does come with an SSH client, so once you have this working with a reasonable terminal, you can use SSH as normal.
    2. I sometimes wondered why the VS Code team put so much effort into the built-in terminal inside the editor. I tried it once on Linux and never touched it again, because the terminal window I had right next to my editor was just massively better in every way. Having used Windows terminals for a while, I now fully understand why it’s there.

      VS Code terminal is not as efficient on Linux

    3. Scoop seems to be the best of the bunch, so far?

      Scoop seems to be the best package manager

    4. They just automate the process of going to the website, downloading an installer and then running it - which is slightly better than doing it yourself.

      Windows package managers are unlike Linux ones

    5. It’s got less useful stuff in it than most Linux distro “app stores” and is utterly minuscule compared to the Debian repositories, which have ~60,000 packages, or Arch’s AUR, with 73,000 (these counts include the whole Linux OS, though, which is installed using the same package manager).

      Windows Store

    6. if you want to add persistent environment variables to your currently running shell, you should put a setx command in your $profile file and then reload it: . $profile - or maybe run myvar="value" && setx %myvar% "value", or something similar.

      Storing persistent environment variables on Windows

    7. `C:\Users\DuncanLock\Documents\PowerShell\Microsoft.PowerShell_profile.ps1` is the PowerShell equivalent of `~/.bashrc`


    8. PowerShell does have a ~ alias for your home folder, and cd ~ works!
    9. You can get Windows versions of most of the standard *nix userland utils, which seem to work OK with PowerShell

      scoop install coreutils

    10. Desktop Linux is often criticized for this, but Windows is much worse, somehow! It’s really inconsistent. Half of it is “new” UI and half of it is old Win32/GDI type UI - just as bad as KDE/GTK - except worse, because you can’t configure them to use the same theme. Also, when you install a Linux distribution, it’ll start off either all KDE or all GTK, or whatever - but with Windows you’re stuck with a random mix of both right from the start.

      Windows is a mess...

    1. Most companies are not prepared to pay for a staging environment identical to production

      Keeping staging environment has its cost

    1. * With personal stability (job, hobbies, network, etc.) and age you care less and less what people think and put less emphasis on what people think about you.
       * At the same time, you judge people less and less because, what does your judgment really matter? And when you judge people less you judge yourself less.
       * A lot of awkwardness is a function of trying hard to fit a social shape and failing. When you let go of most social expectations of yourself and others, in my experience mostly because of age and stability, you lose the awkwardness.
    1. Lists promise value quickly. A normal essay promises a payoff only when you finish it, but lists promise at least some value almost immediately.

    1. Let’s look at a recent paper by Xia, Bao, Lo, Xing, Hassan, & Li entitled Measuring Program Comprehension: A Large-Scale Field Study with Professionals and published in IEEE Transactions on Software Engineering, 44, 951-976, 2018. This paper is quite interesting in that it describes in great detail how the figures are obtained. And it says that comprehension took on average ~58%.

      Developers spend most of their time figuring the system out

    1. # Input
       Input: 123, Output:
       Input: 121, Output:
       Input: 111, Output:
       Input: 123454321, Output:
       Input 123123, Output:

       # Instruction
       Output true if input is a palindrome

       # Output
       Input: 123, Output: false
       Input: 121, Output: true
       Input: 111, Output: true
       Input: 123454321, Output: true
       Input 123123, Output: false

      Example of using GPT-3 for programming

    1. Using named arguments is nice for languages that support it, but this is not always a possibility. Even in Python, where time.sleep is defined with a single argument named secs, we can’t call sleep(secs=300) due to implementation reasons. In that case, we can give the value a name instead.

       Instead of this:

       time.sleep(300)

       Do this:

       sleep_seconds = 300
       time.sleep(sleep_seconds)

       Now the code is unambiguous, and readable without having to consult the documentation.

      Putting units in variable names

    1. defaults write NSGlobalDomain WebKitDeveloperExtras -bool true
       defaults write -g WebKitDeveloperExtras -bool YES

      Type these 2 commands to macOS terminal to be able to "Inspect Elements" in a Safari Web Inspector

    1. * Core: like an engine in a car. The product is meaningless without it.
       * Necessary: like a car’s spare wheel. It’s rarely used but when needed, its function decides the success of the system.
       * Added value: like a car’s cup-holder. It’s nice to have but the product is perfectly usable without it.
       * Unique Selling Point: the main reason people should buy your product instead of your rivals’. For example, your car is the best off-road vehicle.

      4 categories of software features

  3. Mar 2022
    1. Here is an architecture diagram of the devices I’m currently using

      Architecture of recommended Smart Home products (March 2022)

    1. Have you ever built an image only to realize that you actually need it on a user account other than root, requiring you to rebuild the image again in rootless mode? Or have you built an image on one machine but run containers on the image using multiple different machines? Now you need to set up an account on a registry, push the image to the registry, Secure Shell (SSH) to each device you want to run the image on, and then pull the image. The podman image scp command solves both of these annoying scenarios as quickly as they occur.

      Podman 4.0 can transfer container images without a registry.

      For example:

      * You can copy a root image to a non-root account:

        $ podman image scp root@localhost::IMAGE USER@localhost::

      * Or copy an image from one machine to another with this command:

        $ podman image scp me@ you@

    1. an onion address is a promise and a mechanism to assure that you are taking seriously the needs of the people who use Tor.

      Why offer an Onion Address rather than just encourage browsing-over-Tor

    1. As mentioned earlier, PATCH requests should apply partial updates to a resource, whereas PUT replaces an existing resource entirely. It's usually a good idea to design updates around PATCH requests

      Prefer PATCH over PUT
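
      The semantic difference can be sketched without any HTTP at all (the stored resource below is hypothetical): PUT replaces the stored representation with the request body, while PATCH merges the given fields into it, in the spirit of JSON Merge Patch.

      ```python
      stored = {"id": "4TL011ax", "name": "Ada", "email": "ada@example.com"}

      def put(resource: dict, body: dict) -> dict:
          """PUT: the request body *is* the new resource."""
          return dict(body)

      def patch(resource: dict, body: dict) -> dict:
          """PATCH (merge-style): only the given fields change."""
          return {**resource, **body}

      print(put(stored, {"id": "4TL011ax", "name": "Ada L."}))
      # email is gone - PUT replaced the whole resource
      print(patch(stored, {"name": "Ada L."}))
      # email survives - PATCH applied a partial update
      ```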

    2. Aside from using HTTP status codes that indicate the outcome of the request (success or error), when returning errors, always use a standardized error response that includes more detailed information on what went wrong.

      For example:

      ```
      // Request
      => GET /users/4TL011ax

      // Response
      <= 404 Not Found
      {
        "code": "user/not_found",
        "message": "A user with the ID 4TL011ax could not be found."
      }
      ```

    3. https://api.averagecompany.com/v1/health
       https://api.averagecompany.com/health?api_version=1.0

      2 examples of versioning APIs

    4. When dealing with date and time, APIs should always return ISO 8601-formatted strings.
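
      In Python this is a one-liner via datetime.isoformat(); always attach a timezone so the offset is explicit in the string (the payload shape below is illustrative):

      ```python
      from datetime import datetime, timezone

      created_at = datetime(2022, 3, 1, 12, 30, tzinfo=timezone.utc)
      payload = {"created_at": created_at.isoformat()}
      print(payload["created_at"])  # 2022-03-01T12:30:00+00:00

      # Clients can round-trip the string back into a datetime:
      parsed = datetime.fromisoformat(payload["created_at"])
      assert parsed == created_at
      ```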
    1. So, to summarize, don't ask "Any Java experts around?", but rather ask "How do I do [problem] with Java and [other relevant info]?"

      Quick advice on forming questions

    2. You're asking people to take responsibility. You're questioning people's confidence in their abilities. You're also unnecessarily walling other people out. I often answer questions related to languages or libraries I have never used, because the answers are (in a programmer kind of way) common sense.

      Don't ask unspecified questions like "Any Java experts around?"

    1. * "This isn’t MAIN_PRIORITY, so we aren’t going to do it until at least ESTIMATED_DONE_DATE."
       * "Right now our priority is MAIN_PRIORITY because of ONE_SENTENCE_JUSTIFICATION, and this is 100% our shipping focus."
       * "I agree this sounds like a really useful feature - once we finish MAIN_PRIORITY, should we consider dropping SECOND_PRIORITY and do it?"

      Templates for saying NO

    1. for debugging purposes, a good combination is --lf --trace which would start a debug session with pdb at the beginning of the last test that failed:

      pytest --lf --trace

    2. pytest -l

      Show values of local variables in the output with -l

    3. If you start pytest with --pdb, it will start a pdb debugging session right after an exception is raised in your test. Most of the time this is not particularly useful as you might want to inspect each line of code before the raised exception.

      The --pdb option for pytest

    4. pytest --lf

      Run the last failed test only with --lf

      Run all tests, but run the last failed ones first with --ff

    5. pytest -x

      Exiting on the 1st error with -x

    6. pytest --collect-only

      Collecting Pytests (not running them)

    7. pytest test_validator.py::test_regular_email_validates

      Example of running just one test (test_regular_email_validates) from test_validator.py

    8. Apart from shared fixtures you could place external hooks and plugins or modifiers for the PATH used by pytest to discover tests and implementation code.

      Additional things to store in conftest.py

    9. pytest can read its project-specific configuration from one of these files: pytest.ini, tox.ini, setup.cfg

      3 options for configuring pytest

    10. To have the fixture actually be used by one of your tests, you simply add the fixture’s name as an argument


      ```python
      import pytest

      @pytest.fixture()
      def database_environment():
          setup_database()
          yield
          teardown_database()

      def test_world(database_environment):
          assert 1 == 1
      ```

    1. But the problem with Poetry is arguably down to the way Docker’s build works: Dockerfiles are essentially glorified shell scripts, and the build system semantic units are files and complete command runs. There is no way in a normal Docker build to access the actually relevant semantic information: in a better build system, you’d only re-install the changed dependencies, not reinstall all dependencies anytime the list changed. Hopefully someday a better build system will eventually replace the Docker default. Until then, it’s square pegs into round holes.

      Problem with Poetry/Docker

    2. Third, you can use poetry-dynamic-versioning, a plug-in for Poetry that uses Git tags instead of pyproject.toml to set your application’s version. That way you won’t have to edit pyproject.toml to update the version. This seems appealing until you realize you now need to copy .git into your Docker build, which has its own downsides, like larger images unless you’re using multi-stage builds.

      Approach of using poetry-dynamic-versioning plugin

    3. But if you’re doing some sort of continuous deployment process where you’re continuously updating the version field, your Docker builds are going to be slow.

      Be careful when updating the version field of pyproject.toml around Docker

    1. VCR.py works primarily via the @vcr decorator. You can import this decorator by writing: import vcr.

      How VCR.py works

    2. The VCR.py library records the responses from HTTP requests made within your unit tests. The first time you run your tests using VCR.py is like any previous run. But after VCR.py has had the chance to run once and record, all subsequent tests are:

       * Fast! No more waiting for slow HTTP requests and responses in your tests.
       * Deterministic. Every test is repeatable since they run off of previously recorded responses.
       * Offline-capable! Every test can now run offline.

      VCR.py library to speed up Python HTTP tests

    1. 75% of people in the U.S. never tweet. On an average weeknight in January, just 1% of U.S. adults watched primetime Fox News (2.2 million). 0.5% tuned into MSNBC (1.15 million). Nearly three times more Americans (56%) donated to charities during the pandemic than typically give money to politicians and parties (21%).
    1. The ENTRYPOINT specifies a command that will always be executed when the container starts. The CMD specifies arguments that will be fed to the ENTRYPOINT.

      Another great comparison of the ENTRYPOINT and CMD instructions

    1. DevOps is an interesting case study for understanding MLOps for a number of reasons:

       * It underscores the long period of transformation required for enterprise adoption.
       * It shows how the movement is comprised of both tooling advances as well as shifts in cultural mindset at organizations. Both must march forward hand-in-hand.
       * It highlights the emerging need for practitioners with cross-functional skills and expertise. Silos be damned.

      3 things MLOps can learn from DevOps

    2. MLOps today is in a very messy state with regards to tooling, practices, and standards. However, this is to be expected given that we are still in the early phases of broader enterprise machine learning adoption. As this transformation continues over the coming years, expect the dust to settle while ML-driven value becomes more widespread.

      State of MLOps in March 2022

  4. Feb 2022
    1. LXC is a serious contender to virtual machines. So, if you are developing a Linux application or working with servers, and need a real Linux environment, LXC should be your go-to. Docker is a complete solution to distribute applications and is particularly loved by developers. Docker solved the local developer configuration tantrum and became a key component in the CI/CD pipeline because it provides isolation between the workload and a reproducible environment.

      LXC vs Docker

    1. # line containing 'cake' but not 'at'
       # same as: grep 'cake' table.txt | grep -v 'at'
       # with PCRE: grep -P '^(?!.*at).*cake' table.txt
       $ awk '/cake/ && !/at/' table.txt
       blue cake mug shirt -7

      It should be easier to use awk than grep pipelines in bash, especially for AND conditions.

      For example, for "line containing cake but not at": * grep: grep 'cake' table.txt | grep -v 'at' * grep with PCRE: grep -P '^(?!.*at).*cake' table.txt * awk: awk '/cake/ && !/at/' table.txt

    1. If you have a dilemma about whether to use CMD or ENTRYPOINT as the starting point of your container, answer the following question: does my command always HAVE to run? If the answer is yes, use ENTRYPOINT. Furthermore, if you need to pass extra parameters that may be overridden when the container is started, use the CMD instruction as well.

      How to simply decide if to use CMD or ENTRYPOINT in a Dockerfile
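
      That rule of thumb in a minimal Dockerfile (the base image and command are illustrative):

      ```dockerfile
      FROM alpine:3.15

      # ping must always run, so it is the ENTRYPOINT;
      # the host is just a default argument, so it goes in CMD.
      ENTRYPOINT ["ping", "-c", "3"]
      CMD ["localhost"]

      # docker run my-image             -> ping -c 3 localhost
      # docker run my-image example.com -> ping -c 3 example.com
      ```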

    1. More backlinks to you and your work: Being the teammate that contributes to the system of knowledge shared shows how much you care about the success of the organization. And, it does it in a way that helps you have more documented and attributable credibility for the value you create within your organization.

      Why it is important to write clear documentation in an organization

    2. Writing supplants meetings: When there is good documentation around a meeting (briefs, meeting notes, etc.), meetings can be leaner and more productive because people don’t have to be in the room to know what’s happening. So, only those who are actively contributing to the discussion need to attend.
    3. Without meeting notes and documentation, companies become reliant on unreliable verbal accounts, 1:1 updates, and needing to be in the room to get things done.
    4. Scales everyone’s knowledge: think about how many people you interact with on a given work day. I’m talking about the real human to human kind where you relay your ideas to others. It’s probably in the 5-20 range. If you put those same ideas into a doc or an email, your distribution goes to infinity. Anyone can read it.

      Writing can greatly scale up

    1. == and != for string comparison
       -eq, -ne, -gt, -lt, -le, -ge for numerical comparison

      Comparison syntax in Bash
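
      A small demonstration of the difference; note that inside [ ], numbers compared with == are compared as strings:

      ```shell
      a=10; b=10; c=010

      [ "$a" == "$b" ] && echo "strings equal"
      [ "$a" -eq "$c" ] && echo "numbers equal"       # 010 and 10 are equal as numbers
      [ "$a" == "$c" ] || echo "but not as strings"   # leading zero makes them differ
      ```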

    2. > will overwrite the current contents of the file, if the file already exists. If you want to append lines instead, use >>

      > - overwrites text

      >> - appends text

    3. The syntax for “redirecting” some output to stderr is >&2. > means “pipe stdout into” whatever is on the right, which could be a file, etc., and &2 is a reference to “file descriptor #2” which is stderr.

      Using stderr. On the other hand, >&1 is for stdout
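
      A compact demonstration of all three operators, writing to a temp file:

      ```shell
      out=$(mktemp)

      echo "first"  >  "$out"    # > creates or truncates the file
      echo "second" >> "$out"    # >> appends to it
      echo "something went wrong" >&2   # goes to stderr, not the file

      cat "$out"
      # first
      # second
      ```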

    4. single quotes, which don’t expand variables

      In Bash, double quotes ("") expand variables, whereas single quotes ('') don't
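
      A two-line illustration:

      ```shell
      name="world"

      echo "hello $name"   # double quotes expand variables: hello world
      echo 'hello $name'   # single quotes do not: hello $name
      ```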

    5. This only works if you happen to have Bash installed at /bin/bash. Depending on the operating system and distribution of the person running your script, that might not necessarily be true! It’s better to use env, a program that finds an executable on the user’s PATH and runs it.

      Shebang tip: instead of

      ```
      #!/bin/bash
      ```

      use:

      ```
      #!/usr/bin/env bash
      ```

      Alternatively, you can replace `bash` with `python`, `ruby`, etc. Later, chmod the script and run it:

      ```
      $ chmod +x my-script.sh
      $ ./my-script.sh
      ```

  5. Jan 2022
    1. It isn't about note taking, it's about supporting a brain that simply isn't capable of retaining the level of information that we have to deal with. The ultimate note taking device is actually an augmented human brain that has perfect recollection and organisation.

      What note taking is about

    1. In Japan, the country where the first emoji sets originated, red is traditionally used to represent increases in the value of a stock. Meanwhile, green is used to represent decreases in stock value.

      Reason why chart increasing emoji 📈 is red

    1. the curse of knowledge. It’s a simple but devastating effect: Once we know something, it’s very difficult to imagine not knowing it, or to take the perspective of someone who doesn't.

      The curse of knowledge

    1. Adopting Kubernetes-native environments ensures true portability for the hybrid cloud. However, we also need a Kubernetes-native framework to provide the "glue" for applications to seamlessly integrate with Kubernetes and its services. Without application portability, the hybrid cloud is relegated to an environment-only benefit. That framework is Quarkus.

      Quarkus framework

    2. Kubernetes-native is a specialization of cloud-native, and not divorced from what cloud native defines. Whereas a cloud-native application is intended for the cloud, a Kubernetes-native application is designed and built for Kubernetes.

      Kubernetes-native application

    3. According to Wilder, a cloud-native application is any application that was architected to take full advantage of cloud platforms. These applications: Use cloud platform services. Scale horizontally. Scale automatically, using proactive and reactive actions. Handle node and transient failures without degrading. Feature non-blocking asynchronous communication in a loosely coupled architecture.

      Cloud-native applications

    1. two main problems with framing decisions and policies in terms of usefulness: (1) being useful is not always to our own benefit – sometimes, we are being used as a means to someone else’s end, and we end up miserable as a result; and (2) the lenses themselves of usefulness and uselessness can obscure our view of the good life.

      2 main problems of usefulness

    1. The script is in batch with some portions of powershell. The base code is fairly simple and most of it came from Googling ".bat transfer files" followed by ".bat how to only transfer certain file types" etc. The trick was making it work with my office, knowing where to scan for new files, knowing where not to scan due to lag (seriously, if you have a folder with 200,000 .txt files that crap will severely slow down your scans. Better to move it manually and then change the script to omit that folder from future searches)
    2. It essentially scans the on-site drive for any new files, generates hash values for them, transfers them to the Cloud, then generates hash values again for fidelity (in court you have to prove digital evidence hasn't been tampered with).

      Script to automate an 8 hour job: The firm gets thousands of digital documents, photos, etc on a daily basis. All of this goes on a local drive. My job is to transfer all of these files to the Cloud and then verify their fidelity.

    1. Salesforce has a unique use case where they need to serve 100K-500K models because the Salesforce Einstein product builds models for every customer. Their system serves multiple models in each ML serving framework container. To avoid the noisy neighbor problem and prevent some containers from taking significantly more load than others, they use shuffle sharding [8] to assign models to containers. I won’t go into the details and I recommend watching their excellent presentation in [3].

      Case of Salesforce serving 100K-500K ML models with the use of shuffle sharding
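
      The idea behind shuffle sharding can be sketched in a few lines. This is a toy illustration, not Salesforce's implementation; all names are made up:

      ```python
      import hashlib
      import random

      def shard_for(model_id, containers, shard_size=2):
          """Deterministically pick a pseudo-random subset of containers
          for a model, so two heavy models rarely share their full shard."""
          seed = int(hashlib.sha256(model_id.encode()).hexdigest(), 16)
          return sorted(random.Random(seed).sample(containers, shard_size))
      ```

      Because the shard is derived from the model ID, assignment is stable across restarts, and a noisy model only affects the small set of containers in its own shard.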

    2. Batching predictions can be especially beneficial when running neural networks on GPUs since batching takes better advantage of the hardware.

      Batching predictions
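
      A minimal sketch of micro-batching: drain pending requests from a queue and run them through the model as one batch. The `predict_batch` call is a placeholder for a real batched forward pass:

      ```python
      from queue import Queue, Empty

      def predict_batch(inputs):
          # placeholder model: a real service would run one batched
          # forward pass on the GPU here
          return [x * x for x in inputs]

      def drain_and_predict(requests, max_batch=8, timeout=0.01):
          """Collect up to max_batch queued requests, then predict them together."""
          batch = []
          try:
              while len(batch) < max_batch:
                  batch.append(requests.get(timeout=timeout))
          except Empty:
              pass
          return predict_batch(batch) if batch else []
      ```

      The timeout bounds latency: a request waits at most a few milliseconds for batch-mates before being served anyway.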

    3. Inference Service — provides the serving API. Clients can send requests to different routes to get predictions from different models. The Inference Service unifies serving logic across models and provides easier interaction with other internal services. As a result, data scientists don’t need to take on those concerns. Also, the Inference Service calls out to ML serving containers to obtain model predictions. That way, the Inference Service can focus on I/O-bound operations while the model serving frameworks focus on compute-bound operations. Each set of services can be scaled independently based on their unique performance characteristics.

      Responsibilities of Inference Service

    4. Provide a model config file with the model’s input features, the model location, what it needs to run (like a reference to a Docker image), CPU & memory requests, and other relevant information.

      Contents of a model config file
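
      Such a config might look roughly like this; the field names here are illustrative, not an actual schema:

      ```yaml
      model_name: churn-predictor              # hypothetical model
      input_features: [tenure_months, plan_type, support_tickets]
      model_location: s3://models/churn/v3/
      runtime:
        docker_image: registry.example.com/serving/xgboost:1.5
        cpu_request: "500m"
        memory_request: 1Gi
      ```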

    5. what changes when you need to deploy hundreds to thousands of online models? The TLDR: much more automation and standardization.

      MLOps focuses deeply on automation and standardization

    1. “Shadow Mode” or “Dark Launch” as Google calls it is a technique where production traffic and data is run through a newly deployed version of a service or machine learning model, without that service or model actually returning the response or prediction to customers/other systems. Instead, the old version of the service or model continues to serve responses or predictions, and the new version’s results are merely captured and stored for analysis.

      Shadow mode
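
      The pattern boils down to a few lines. In this sketch the two `predict_*` functions are stand-ins for the old and new model:

      ```python
      shadow_log = []  # in production this would go to durable storage

      def predict_old(x):
          return x + 1  # stand-in for the current production model

      def predict_new(x):
          return x + 2  # stand-in for the newly deployed candidate

      def handle_request(x):
          served = predict_old(x)  # the customer only ever sees this
          try:
              shadow_log.append((x, predict_new(x)))  # captured for later analysis
          except Exception:
              pass  # a failing shadow model must never break the live path
          return served
      ```

      The key property: the new model sees real traffic, but its output never reaches the caller and its failures never affect the live response.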

    1. you can also mount different FastAPI applications within the FastAPI application. This would mean that every sub-FastAPI application would have its docs, would run independent of other applications, and will handle its path-specific requests. To mount this, simply create a master application and sub-application file. Now, import the app object from the sub-application file to the master application file and pass this object directly to the mount function of the master application object.

      It's possible to mount FastAPI applications within a FastAPI application

    1. There are officially 5 types of UUID values, version 1 to 5, but the most common are: time-based (version 1 or version 2) and purely random (version 4). The time-based UUIDs encode the number of 10ns since January 1st, 1970 in 7.5 bytes (60 bits), which is split in a “time-low”-“time-mid”-“time-hi” fashion. The missing 4 bits are the version number used as a prefix to the time-hi field.  This yields the 64 bits of the first 3 groups. The last 2 groups are the clock sequence, a value incremented every time the clock is modified, and a host unique identifier.

      There are 5 types of UUIDs (source):

      Type 1: stuffs MAC address+datetime into 128 bits

      Type 3: stuffs an MD5 hash into 128 bits

      Type 4: stuffs random data into 128 bits

      Type 5: stuffs an SHA1 hash into 128 bits

      Type 6: unofficial idea for sequential UUIDs

    2. Even though most posts are warning people against the use of UUIDs, they are still very popular. This popularity comes from the fact that these values can easily be generated by remote devices, with a very low probability of collision.
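
      Python's standard `uuid` module can generate all four official variants mentioned above:

      ```python
      import uuid

      u1 = uuid.uuid1()                             # time + MAC based
      u3 = uuid.uuid3(uuid.NAMESPACE_DNS, "a.com")  # MD5 of namespace + name
      u4 = uuid.uuid4()                             # purely random
      u5 = uuid.uuid5(uuid.NAMESPACE_DNS, "a.com")  # SHA-1 of namespace + name

      # the version number is embedded in the value itself
      print([u.version for u in (u1, u3, u4, u5)])  # [1, 3, 4, 5]
      ```

      Note that the name-based variants (3 and 5) are deterministic: the same namespace and name always yield the same UUID.
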
    1. This basic example compiles a simple Go program. The naive way on the left results in a 961 MB image. When using a multi-stage build, we copy just the compiled binary which results in a 7 MB image.
      # Image size: 7 MB
      FROM golang:1.17.5 as builder
      WORKDIR /workspace
      COPY . .
      RUN go get && go build -o main .
      FROM scratch
      WORKDIR /workspace
      COPY --from=builder /workspace/main /workspace/main
      CMD ["/workspace/main"]
    2. Docker introduced multi-stage builds starting from Docker Engine v17.05. This allows us to perform all preparation steps as before, but then copy only the essential files or output from these steps.

      Multi-stage builds are great for Dockerfile steps that aren't used at runtime

    3. Making a small change to a file or moving it will create an entire copy of the file. Deleting a file will only hide it from the final image, but it will still exist in its original layer, taking up space. This is all a result of how images are structured as a series of read-only layers. This provides reusability of layers and efficiencies with regards to how images are stored and executed. But this also means we need to be aware of the underlying structure and take it into account when we create our Dockerfile.

      Summary of file duplication topic in Docker images

    4. In this example, we created 3 copies of our file throughout different layers of the image. Despite removing the file in the last layer, the image still contains the file in other layers which contributes to the overall size of the image.
      FROM debian:bullseye
      COPY somefile.txt . #1
      # Small change but entire file is copied
      RUN echo "more data" >> somefile.txt #2
      # File moved but layer now contains an entire copy of the file
      RUN mv somefile.txt somefile2.txt #3
      # File won't exist in this layer,
      # but it still takes up space in the previous ones.
      RUN rm somefile2.txt
    5. We’re just chmod'ing an existing file, but Docker can’t change the file in its original layer, so the result is a new layer where the file is copied in its entirety with the new permissions. In newer versions of Docker, this can be written as follows, using Docker’s BuildKit, to avoid the issue:

      Instead of this:

      FROM debian:bullseye
      COPY somefile.txt .
      RUN chmod 777 somefile.txt

      Try to use this:

      FROM debian:bullseye
      COPY --chmod=777 somefile.txt .
    6. when you make changes to files that come from previous layers, they’re copied into the new layer you’re creating.
    7. Many processes will create temporary files, caches, and other files that have no benefit to your specific use case. For example, running apt-get update will update internal files that you don’t need to persist because you’ve already installed all the packages you need. So we can add rm -rf /var/lib/apt/lists/* as part of the same layer to remove those (removing them with a separate RUN will keep them in the original layer, see “Avoid duplicating files”). Docker recognizes this as an issue and went as far as adding apt-get clean automatically for their official Debian and Ubuntu images.

      Removing cache
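
      A sketch of what that looks like in practice (curl is just an example package); the cleanup must happen in the same RUN instruction, otherwise the lists survive in the earlier layer:

      ```dockerfile
      FROM debian:bullseye
      # update, install and clean up in ONE layer, so the apt lists
      # never get baked into the image
      RUN apt-get update \
       && apt-get install -y --no-install-recommends curl \
       && rm -rf /var/lib/apt/lists/*
      ```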

    8. An important way to ensure you’re not bringing in unintended files is to define a .dockerignore file.

      .dockerignore sample:

      # Ignore git and caches
      # Ignore logs
      # Ignore secrets
      # Ignore installed dependencies
    9. You can save any local image as a tar archive and then inspect its contents.

      Example of inspecting docker image:

      bash-3.2$ docker save <image-digest> -o image.tar
      bash-3.2$ tar -xf image.tar -C image
      bash-3.2$ cd image
      bash-3.2$ tar -xf <layer-digest>/layer.tar
      bash-3.2$ ls

      One can also use Dive or Contains.dev

    1. Technology → Cool Stuff to Work On
       Intellection → Smart People to Work With
       Certainty → Repeatability in Work Environment

      What engineers mostly want from job offers

    1. Smalltalk image is unlike a collection of Java class files in that it can store your programs (both source and compiled bytecode), their data, and execution state. You can quit and save your work while some code is executing, move your image to an entirely different machine, load your image… and your program picks up where it left off.

      Advantage of the **Smalltalk** VM

    2. Smalltalk has a virtual machine which allows you to execute your code on any platform where the VM can run. The image however is not only all your code, but the entire Smalltalk system, including said virtual machine (because of course it’s written in itself).

      Smalltalk has a VM

    1. The most important thing you can do is provide actionable feedback. This means being specific about your observations of their work. It also means providing direction about what they could have done differently or what they need to learn or practice in order to improve. If your mentee doesn’t know what to do next, your feedback wasn’t actionable.

      In mentoring, make sure to provide actionable feedback

    1. So when the morale on the team dips (usually related to lack of clarity or setbacks making things take a long time) people leave the team, which further damages morale and can easily result in a mass exodus.
    2. People come and go (which has the net impact of making you feel like you are just a resource)
    1. Instead of “I have a type, it’s called MyType, it has a constructor, in the constructor I assign the property ‘A’ to the parameter ‘A’ (and so on)”, you say “I have a type, it’s called MyType, it has an attribute called a”

      How class declaration in Plain Old Python compares to attrs

    2. attrs lets you declare the fields on your class, along with lots of potentially interesting metadata about them, and then get that metadata back out.

      The essence of what attrs does

    3. >>> Point3D(1, 2, 3) == Point3D(1, 2, 3)

      The attrs library includes value comparison and does not require an explicit implementation:

          def __eq__(self, other):
              if not isinstance(other, self.__class__):
                  return NotImplemented
              return (self.x, self.y, self.z) == (other.x, other.y, other.z)
          def __lt__(self, other):
              if not isinstance(other, self.__class__):
                  return NotImplemented
              return (self.x, self.y, self.z) < (other.x, other.y, other.z)
    4. >>> Point3D(1, 2, 3)

      The attrs library includes string representation and does not require an explicit implementation:

      def __repr__(self):
          return (self.__class__.__name__ +
              ("(x={}, y={}, z={})".format(self.x, self.y, self.z)))
    5. Look, no inheritance! By using a class decorator, Point3D remains a Plain Old Python Class (albeit with some helpful double-underscore methods tacked on, as we’ll see momentarily).

      The attrs library removes a lot of boilerplate code when defining Python classes, and includes features such as string representation and value comparison.

      Example of a Plain Old Python Class:

      class Point3D(object):
          def __init__(self, x, y, z):
              self.x = x
              self.y = y
              self.z = z

      Example of a Python class defined with attr:

      import attr

      @attr.s
      class Point3D(object):
          x = attr.ib()
          y = attr.ib()
          z = attr.ib()
    1. The best way to improve your ability to think is to actually spend time thinking.

      You need to take your time

    2. Thinking means concentrating on one thing long enough to develop an idea about it.


    1. This runs a loop 555 times. Takes a screenshot, names it for the loop number with padded zeros, taps the bottom right of the screen, then waits for a second to ensure the page has refreshed. Slow and dull, but works reliably.

      Simple bash script to use via ADB to automatically scan pages:

      for i in {00001..00555}; do
         adb exec-out screencap -p > $i.png
         adb shell input tap 1000 2000
         sleep 1s
      done
      echo All done
    1. Gary Klein himself has made a name developing techniques for extracting pieces of tacit knowledge and making it explicit. (The technique is called ‘The Critical Decision Method’, but it is difficult to pull off because it demands expertise in CDM itself).

      AI can help with turning tacit knowledge into explicit knowledge

    2. If you are a knowledge worker, tacit knowledge is a lot more important to the development of your field of expertise than you might think.

      Tacit knowledge is especially important for knowledge workers

    3. Notice how little verbal instruction is involved. What is more important is emulation, and action — that is, a focus on the embodied feelings necessary to ride a bicycle successfully. And this exercise was quite magical for me, for within the span of an hour I could watch a kid go from conscious incompetence to conscious competence and finally to unconscious competence. In other words, tacit knowledge instruction happens through things like imitation, emulation, and apprenticeship. You learn by copying what the master does, blindly, until you internalise the principles behind the actions.

      In learning, imitation, emulation and action are very important.

    4. When I was a kid, I taught myself how to ride a bike … by accident. And then I taught my sisters and then my cousin and then another kid in the neighbourhood who was interested but a little scared. They were zooming around in about an hour each. The steps were as follows:

      Interesting example of how to teach oneself to ride a bike (see steps below)

    5. Tacit knowledge is knowledge that cannot be captured through words alone.

      Tacit knowledge

  6. Dec 2021
    1. Artifactory/Nexus/Docker repo was unavailable for a tiny fraction of a second when downloading/uploading packages. The Jenkins builder randomly got stuck.

      Typical random issues when deploying microservices

    2. Microservices can really bring value to the table, but the question is; at what cost? Even though the promises sound really good, you have more moving pieces within your architecture which naturally leads to more failure. What if your messaging system breaks? What if there’s an issue with your K8S cluster? What if Jaeger is down and you can’t trace errors? What if metrics are not coming into Prometheus?

      Microservices have quite many moving parts

    3. If you’re going with a microservice:

      9 things needed for deploying a microservice (listed below)

    4. Let’s take a simple online store app as an example.

      5 things needed for deploying a monolith (listed below)

    5. some of the pros for going microservices

      Pros of microservices (not always all are applicable):

      • Fault isolation
      • Eliminating the technology lock
      • Easier understanding
      • Faster deployment
      • Scalability
    1. I think one of the ways that remote work changes this is that I can do other things while I think through a tricky problem; I can do dishes or walk my dog or something instead of trying to look busy in a room with 6-12 other people who are furiously typing because that's how the manager and project manager understand that work gets done.

      What work often looks like during remote dev work

    1. docker scan elastic/logstash:7.13.3 | grep 'Arbitrary Code Execution'

      Example of scanning docker image for a log4j vulnerability

    1. AAX is Pro Tools' exclusive format for virtual synths. So you really only need it if you use Pro Tools (which doesn't support the VST format).

      Only install VST plugin format (not AAX) for FL Studio

    1. When you usually try to download an image, your browser opens a connection to the server and sends a GET request asking for the image. The server responds with the image and closes the connection. Here however, the server sends the image, but doesn't close the connection! It keeps the connection open and sends bits of data periodically to make sure it stays open. So your browser thinks that the image is still being sent over, which is why the download seems to be going on infinitely.

      How to keep a user's image download from ever finishing
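
      Server-side, the trick is just a response body that never ends. A rough, framework-agnostic sketch of such a generator (all names made up):

      ```python
      import time

      def endless_image(image_bytes, chunk_size=1024, keepalive_every=10):
          """Yield the real image, then trickle filler bytes forever so the
          client never sees the response complete."""
          for i in range(0, len(image_bytes), chunk_size):
              yield image_bytes[i:i + chunk_size]
          while True:
              time.sleep(keepalive_every)  # just often enough to keep the socket open
              yield b"\x00"
      ```

      The browser receives the full image data but, since the stream never terminates, it keeps reporting the download as in progress.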

    1. be microfamous. Microfame is the best kind of fame, because it combines an easier task (be famous to fewer people) with a better outcome (be famous to the right people).

      Idea of being microfamous over famous

    1. Windows 10 Enterprise LTSC 2021 (the full, correct name of the 21H2 LTSC release) will receive updates until January 2032.

      Windows 10 LTSC

    2. Experienced users knew, however, that it was worth waiting for the Server edition. In the case of Windows 11, it seems one should wait for the LTSC release.

      You may want to wait for Windows 11 LTSC before updating from Windows 10 which gets LTSC first

    1. if it's important enough it will surface somewhere somehow.

      A way to handle FOMO

    2. The Feynman technique goes as follows: write down what you know about a subject. Explain it in words simple enough for a 6th grader to understand. Identify any gaps in your knowledge and read up on that again until all of the explanation is dead simple and short.

      The Feynman technique for learning

    3. Similar to frequency lists in a natural language there are concepts in any software product that the rest of the software builds upon. I'll call them core concepts. Let's use git as an example. Three core concepts are commits, branches and conflicts.If you know these three core concepts you can proceed with further learning.

      To speed up learning, start with the core concepts like frequency lists for learning languages

  7. Nov 2021
    1. We can also see that converting the original non-dithered image to WebP gets a much smaller file size, and the original aesthetic of the image is preserved.

      Favor converting images to WebP over dithering them

    1. A lot of us are going to die of unpredictable diseases, some of us young. Really, don't spend your life getting fitter, healthier, more productive. We are all going to die, and Earth will explode in the Sun in a few billion years: please, enjoy some now.


    1. I’d probably choose the official Docker Python image (python:3.9-slim-bullseye) just to ensure the latest bugfixes are always available.

      python:3.9-slim-bullseye may be the sweet spot for a Python Docker image

    2. So which should you use? If you’re a RedHat shop, you’ll want to use their image. If you want the absolute latest bugfix version of Python, or a wide variety of versions, the official Docker Python image is your best bet. If you care about performance, Debian 11 or Ubuntu 20.04 will give you one of the fastest builds of Python; Ubuntu does better on point releases, but will have slightly larger images (see above). The difference is at most 10% though, and many applications are not bottlenecked on Python performance.

      Choosing the best Python base Docker image depends on different factors.

    3. There are three major operating systems that roughly meet the above criteria: Debian “Bullseye” 11, Ubuntu 20.04 LTS, and RedHat Enterprise Linux 8.

      3 candidates for the best Python base Docker image

    1. While people who are both trustworthy and competent are the most sought after when it comes to team assembly, friendliness and trustworthiness are often more important factors than competency.
    2. The researchers found that people who exhibited both competence, through the use of challenging voice, and trustworthiness, through the use of supportive voice, were the most in-demand people when it came to assembling teams.
      • Challenging voice: Communicating in a way that challenges the status quo and is focused on new ideas and efficiency.
      • Supportive voice: Communicating in a way that strengthens social ties and trust, and builds friendly cohesion of a team.
    1. Feature GraphQL REST

      GraphQL vs REST (table)

    2. There are advantages and disadvantages to both systems, and both have their use in modern API development. However, GraphQL was developed to combat some perceived weaknesses with the REST system, and to create a more efficient, client-driven API.

      List of differences between REST and GraphQL (below this annotation)

    1. special permission bit at the end here t, this means everyone can add files, write files, modify files in the /tmp directory, but only root can delete the /tmp directory

      t permission bit
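
      You can reproduce the /tmp-style permissions on any directory (a quick sketch):

      ```shell
      # create a world-writable dir with the sticky bit (mode 1777, like /tmp)
      d="$(mktemp -d)"
      chmod 1777 "$d"
      ls -ld "$d"    # mode shows as drwxrwxrwt -- note the trailing 't'
      ```

      With the sticky bit set, anyone can create files in the directory, but an entry can only be removed or renamed by its owner (or root).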

    1. We implemented a bash script to be installed in the master node of the EMR cluster, and the script is scheduled to run every 5 minutes. The script monitors the clusters and sends a CUSTOM metric EMR-INUSE (0=inactive; 1=active) to CloudWatch every 5 minutes. If CloudWatch receives 0 (inactive) for some predefined set of data points, it triggers an alarm, which in turn executes an AWS Lambda function that terminates the cluster.

      Solution to terminate EMR cluster; however, right now EMR supports auto-termination policy out of the box

    1. git ls-files is more than 5 times faster than both fd --no-ignore and find

      git ls-files is the fastest command to find entries in the filesystem
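
      The speed difference comes from `git ls-files` reading the index instead of walking the directory tree. A minimal comparison in a throwaway repo (assuming git is installed):

      ```shell
      # set up a tiny throwaway repo
      repo="$(mktemp -d)"
      cd "$repo"
      git init -q .
      mkdir -p src
      touch src/a.py src/b.py untracked.txt
      git add src

      git ls-files                                   # tracked files only, read from the index
      find . -path ./.git -prune -o -type f -print   # walks the whole tree
      ```

      Note that `git ls-files` only lists tracked files, so the two commands are not exact equivalents.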

    1. If we call this using Bash, it never gets further than the exec line, and when called using Python it will print lol as that's the only effective Python statement in that file.
      "exec" "python" "myscript.py" "$@"
    2. For Python, the variable assignment is just a variable holding a weird string; for Bash, the command substitution gets executed and we store the result.

      __PYTHON="$(command -v python3 || command -v python)"
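
      Putting the trick together, a complete polyglot file might look like this (a sketch; run it with either `sh myscript.py` or `python3 myscript.py`):

      ```shell
      # write a file that is simultaneously valid shell and valid Python:
      # to shell, the first line exec's python3 on the same file;
      # to Python, it is a harmless concatenation of string literals
      cat > myscript.py <<'EOF'
      "exec" "python3" "myscript.py" "$@"
      print("lol")
      EOF

      sh myscript.py        # shell re-execs the file under python3
      python3 myscript.py   # python skips the string-literal line
      ```

      Both invocations print `lol`, because the shell path ends up running the same file through Python anyway.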

    1. There are a handful of tools that I used to use and now it’s narrowed down to just one or two: pandas-profiling and Dataiku for columnar or numeric data - here’s some getting started tips. I used to also load the data into bamboolib but the purpose of such a tool is different. For text data I have written my own profiler called nlp-profiler.

      Tools to help with data exploration:

      • pandas-profiling
      • Dataiku
      • bamboolib
      • nlp-profiler
    1. If for some reason you don’t see a running pod from this command, then using kubectl describe po a is your next-best option. Look at the events to find errors for what might have gone wrong.

      kubectl run a --image alpine --command -- /bin/sleep 1d

    2. As with listing nodes, you should first look at the status column and look for errors. The ready column will show how many pods are desired and how many are running.

      kubectl get pods -A -o wide

    3. -o wide option will tell us additional details like operating system (OS), IP address and container runtime. The first thing you should look for is the status. If the node doesn’t say “Ready” you might have a problem, but not always.

      kubectl get nodes -o wide

    4. This command will be the easiest way to discover if your scheduler, controller-manager and etcd node(s) are healthy.

      kubectl get componentstatus

    5. If something broke recently, you can look at the cluster events to see what was happening before and after things broke.

      kubectl get events -A

    6. this command will tell you what CRDs (custom resource definitions) have been installed in your cluster and what API version each resource is at. This could give you some insights into looking at logs on controllers or workload definitions.

      kubectl api-resources -o wide --sort-by name

    7. kubectl get --raw '/healthz?verbose'

      Alternative to kubectl get componentstatus. It does not show scheduler or controller-manager output, but it adds a lot of additional checks that might be valuable if things are broken

    8. Here are the eight commands to run

      8 commands to debug Kubernetes cluster:

      kubectl version --short
      kubectl cluster-info
      kubectl get componentstatus
      kubectl api-resources -o wide --sort-by name
      kubectl get events -A
      kubectl get nodes -o wide
      kubectl get pods -A -o wide
      kubectl run a --image alpine --command -- /bin/sleep 1d
    1. 80% of developers are "dark": they don't write, speak, or participate in public tech discourse.

      After working in tech, I would estimate the same

    2. They'll teach you for free. Most people don't see what's right in front of them. But not you. "With so many junior devs out there, why will they help me?", you ask. Because you learn in public. By teaching you, they teach many. You amplify them.

      Senior engineers can teach you for free if you just open up online