- Oct 2024
-
lathamt.substack.com lathamt.substack.com
-
You can’t connect information if you don’t know it.
-
- Sep 2024
-
www.pearlleff.com www.pearlleff.com
-
Imagine you're an engineering manager. Who would you rather hire: the person who knows exactly what features are available in PHP 7 and which are only available in PHP 8, or the one who will figure it out by trial-and-error while writing each application and seeing what fails? Of course, the second engineer certainly may produce quality work. But the first one unquestionably has a comprehensively organized framework of the tools he has at his disposal.
Instead of griping about "memorizing trivia" for job interviews, we'd all do better throughout our careers if we put ourselves into the EM's shoes.
-
- Apr 2024
-
www.computer.org www.computer.org
-
Mathematical software providers would have preferred to standardize high-level language facilities, but there was no chance of getting language standards interested until there were some hardware implementations and some compilers and libraries proving the concepts.
Order of operations
-
- Jun 2023
-
inst.eecs.berkeley.edu inst.eecs.berkeley.edu
-
GLUT framework
Installed FreeGLUT with Homebrew:
brew install freeglut
-
-
sre.google sre.google
-
20,000 clicks/second and 1,000,000 queries/second.
That estimated 2% CTR helps inform design decisions again
-
In practice, we’d want to add more processes to increase parallelism—to handle accumulated backlog after any downtime or traffic spikes.
Consider how the system bounces back.
-
We’ve just described an example of the well-known consensus problem in distributed systems engineering.
This is why we learn about well-known problems, so we can pattern-match our specific problem against well-known, generically-described problems to find known solutions and work-arounds.
-
We need to evolve our design to use more than one datacenter.
What's it like to operate at this scale?
-
all its work must be redone
Find the places where all work must be redone, even if only part of the system fails.
-
keep implementations consistent
This seems like a hint that the system is moving in the desired direction, that the implementations are able to be consistent.
-
we know from our one-machine iteration that the individual small writes are too frequent to store on a hard drive
Keep those learnings handy while iterating on the design, even when discarding initial designs.
-
there are potentially three ad_ids that could be clicked on for each search query
Ah, I was wondering where the earlier "three ad_ids" came from -- this is a limit that comes from the product, which is how many ads may be shown on the search results page.
-
ignore ad_id and search_term because they are a small linear factor
Key: identify small linear factors; what's actually capable of getting huge?
-
We can delete data that’s too old to be of value.
Is this data from QueryStore that doesn't/won't have a corresponding click from ClickLog?
-
introducing a new distributed system component
At this point in my understanding of systems, my knee-jerk take is to default to finding spots to remove components. I appreciate seeing how this mindset isn't always helpful.
-
For simplicity’s sake
It seems like keeping this perspective would be useful in practice, and not just for the purpose of keeping an article like this one moving.
-
we can round our 86.4 TB/day up to 100 TB of space required
Rounding up as part of the iterative process
-
we can use scientific notation to limit errors caused by arithmetic on inconsistent units
Huh?
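One possible reading of the scientific-notation remark: write every quantity in base units (bytes, seconds) as a mantissa times a power of ten, so the exponents can be added by eye and a KB/MB/TB mix-up is easier to spot. A minimal Python sketch follows. The 2 KB record size is an assumption, chosen only because it reproduces the 86.4 TB/day figure quoted above, and the function name is hypothetical.

```python
# Napkin math with every quantity in base units (bytes, seconds),
# written as mantissa * 10**exponent so the exponents can be checked by eye.

QUERIES_PER_SECOND = 5 * 10**5   # 500,000 queries/second, from the quoted text
SECONDS_PER_DAY = 864 * 10**2    # 8.64e4 seconds in a day
BYTES_PER_RECORD = 2 * 10**3     # ASSUMPTION: roughly 2 KB per query log record
BYTES_PER_TB = 10**12


def daily_log_terabytes(rate, seconds, record_bytes):
    """Return the log volume written per day, in terabytes."""
    total_bytes = rate * seconds * record_bytes  # 5e5 * 8.64e4 * 2e3 = 8.64e13
    return total_bytes / BYTES_PER_TB


tb_per_day = daily_log_terabytes(QUERIES_PER_SECOND, SECONDS_PER_DAY, BYTES_PER_RECORD)
print(f"{tb_per_day} TB/day")
```

Adding the exponents (5 + 4 + 3 = 12) and multiplying the mantissas (5 * 8.64 * 2 = 86.4) gives the same answer without ever converting between prefixes.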
-
because the average CTR is 2% (10,000 clicks / 500,000 queries), the click log will have 2% as many records as the query log.
See: napkin math and the value of already having your relevant numbers when starting to design a new system
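The ratio itself is small enough to sketch. This is only an illustration of the arithmetic in the quoted text; the function name is hypothetical.

```python
def click_through_rate(clicks, impressions):
    """CTR = times the ad was clicked / times the ad was shown."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


# Figures from the quoted passage: 10,000 clicks per 500,000 queries.
ctr = click_through_rate(10_000, 500_000)

# If the click log holds one record per click and the query log one per query,
# the click log is about ctr times the size of the query log.
query_records = 500_000
expected_click_records = round(query_records * ctr)
print(f"CTR = {ctr:.0%}, click records per {query_records} queries = {expected_click_records}")
```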
-
scanning through the query log and the click log to generate the dashboard will be very inefficient
Naïve implementation would be O(m x n)?
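To make the O(m x n) guess concrete, here is a minimal Python sketch comparing a nested scan with a single pass over each log. The record shapes are assumptions for illustration only: a query log entry is `(query_id, ad_ids)` and a click log entry is `(query_id, ad_id)`.

```python
from collections import Counter


def ctr_naive(query_log, click_log, ad_id):
    """O(m * n): rescan the whole click log for every query that showed the ad."""
    shown = clicked = 0
    for query_id, ad_ids in query_log:
        if ad_id not in ad_ids:
            continue
        shown += 1
        for click_query_id, click_ad_id in click_log:
            if click_query_id == query_id and click_ad_id == ad_id:
                clicked += 1
    return clicked / shown if shown else 0.0


def ctr_indexed(query_log, click_log, ad_id):
    """O(m + n): one pass over each log, counting per ad_id."""
    shown = Counter()
    for _, ad_ids in query_log:
        shown.update(set(ad_ids))
    clicked = Counter(click_ad_id for _, click_ad_id in click_log)
    return clicked[ad_id] / shown[ad_id] if shown[ad_id] else 0.0


# Hypothetical sample data.
query_log = [(1, ("a", "b", "c")), (2, ("a",)), (3, ("b", "c")), (4, ("a", "c"))]
click_log = [(1, "a"), (3, "c")]
```

The indexed version trades memory for time, which is the same trade the design in the article makes when it moves from scanning logs to maintaining a lookup structure.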
-
Achieving this SLO requires that the speed of calculating a CTR remains constant as the system handles large amounts of clicks and queries.
Constraint set by SLO determines solution's acceptable bounds
-
Click logs are derived from URLs, which have inherent size limitations, making the separate query log a more scalable solution.
This is unclear to me.
-
scale up our basic design
Step 2.
Then, verify that it's possible to implement, determine in what ways it's resilient, and then find the components/interactions to improve.
-
invent a design that works in principle
Step 1.
Then, verify that it's possible, and figure out how to do it better.
-
This analysis either feeds into the next iteration or indicates when the design is good enough to recommend.
Refinement via a feedback cycle
-
CTR is the ratio of times the ad is clicked versus the number of times the ad is shown.
🔑
-
Ultimately, we arrive at a system that defends against many failure modes and satisfies both the initial requirements and additional details that emerged as we iterated
Iteration as a way to surface unobvious requirements
-
By following an iterative style of system design and implementation, we arrive at robust and scalable designs with low operational costs.
Goal.
-
- May 2023
-
interconnected.org interconnected.org
-
problem
Next up would be to smooth the way for growth, as Webb says.
-
a coordination problem
This insight is part of what made Licklider so successful in getting ARPANet going -- from his vantage point at ARPA, he was able to identify what needed doing, and was able to find the folks who could do it.
-
I think this viewpoint was developed and espoused (in the form of funding preferences and memos) by inaugural IPTO director at ARPA (1962-1964) J. C. R. Licklider.
I would agree. :D
-
M. Mitchell Waldrop’s biography
Just in case anyone else needs to hear this from someone else: great book.
-
Protocols are just agreed ways to communicate. A protocol embodies an architecture of participation.
Key idea!
-
These are both ecosystems that provide infrastructure by harnessing market forces. Get it right, and the incentives align towards getting cheaper, better, and more accessible.
"Ecosystem" seems to be one of the load-bearing ways of thinking about this scale of project -- crucial!
-
- Mar 2023
-
about.sourcegraph.com about.sourcegraph.com
-
until you realize that it’s still just incredibly tiny compared to real-world code bases
More ammunition for the microservices crowd?
-
-
scrollprize.org scrollprize.org
-
Our expectation is that you will build on these techniques, improving the tools and models.
Will incremental progress be enough to read these scrolls?
-
- Jan 2023
-
interconnected.org interconnected.org
-
No, I think the big problem will be that legal system only works for humans.
Been thinking about this for AGI -- how does the legal system punish an AGI by degrees? What's lock-up look like for an AGI? Or fines? I think the knee-jerk reaction would be just turn it off, but wouldn't that just be an execution?
-
People readily find humanity in the unlikeliest of places
"The program gets mad with that input."
-
a tree
Reminds me of this exhibit I saw at the Carnegie Museum of Art: https://nextpittsburgh.com/latest-news/a-tree-is-one-of-the-most-important-works-in-the-carnegie-international/
-
- Dec 2022
-
www.worksinprogress.co www.worksinprogress.co
-
By the early 1900s, Cleveland, Ohio was instrumental in pushing forward the frontier on electricity and steel.
Ahem.
-
-
www.worksinprogress.co www.worksinprogress.co
-
Next time you hear someone complaining about a baby crying on an airplane, tell them they’re being a jerk.
👏
-
While immigration undoubtedly helps with an aging population, it also has a destabilising effect on democracies, which leads to bad electoral outcomes and policies.
Perhaps not immigration itself, but negative attitudes toward immigration.
-
Malthusian trap
"...is the idea that population growth is potentially exponential while the growth of the food supply or other resources is linear, which eventually reduces living standards to the point of triggering a population die off."
Did not know the name for this!
-
-
costanoa.vc costanoa.vc
-
The real differentiator is the collection, curation and use of data to improve user experience of end users, whether for the development of foundation models or in applications.
This ties back to the importance of having a great product that folks will want to keep using.
-
- Jun 2022
-
herman.bearblog.dev herman.bearblog.dev
-
Coming from game development
I think we'd all benefit from more cross-pollination like this.
-
it is easier to modify, build on, and maintain
The old adage comes to mind -- "the fastest way to get code to production is to call it a 'demo.'"
-
Instead of building MVPs, we should be building SLCs. Something Simple, Loveable, and Complete.
The main point
-
-
ceeeej-blog.tumblr.com ceeeej-blog.tumblr.com
-
One night, in desperation, I made a long list of concepts that I wanted to reflect in V, moving from one to another with a rapid free-association that would make any good psychiatrist reach for the emergency cord. The list was something as follows; Orwell. Huxley. Thomas Disch. Judge Dredd. Harlan Ellison’s “'Repent, Harlequin!’ Said the Ticktockman.” “Catman” and “Prowler in the City at the Edge of the World” by the same author. Vincent Price’s Dr. Phibes and Theatre Of Blood. David Bowie. The Shadow. Nightraven. Batman. Fahrenheit 451. The writings of the New Worlds school of science fiction. Max Ernst’s painting “Europe After The Rains,” Thomas Pynchon, The atmosphere of British Second World War films. The Prisoner. Robin Hood. Dick Turpin…
Brainstorming
-
- May 2022
-
ckrybus.com ckrybus.com
-
high coherence of process information, high process complexity and high process controllability (whether manual or by adequate automatics) were all associated with low levels of stress and workload and good health, and the inverse, while fast process dynamics and a high frequency of actions which cannot be made directly on the interface were associated with high stress and workload and poor health. High process controllability, good interface ergonomics and a rich pattern of activities were all associated with high feeling of achievement. Many studies show that high levels of stress lead to errors, while poor health and low job satisfaction lead to the high indirect costs of absenteeism, etc. (e.g. Mobley and colleagues, 1979).
Years of justification for the dev ex teams getting more resources
-
The level of skill that a worker has is also a major aspect of his status, both within and outside the working community.
How do we find meaning in work when our work is meaning less and less?
-
The second problem is that if the decisions can be fully specified then a computer can make them more quickly, taking into account more dimensions and using more accurately specified criteria than a human operator can. There is therefore no way in which the human operator can check in real-time that the computer is following its rules correctly.
Then, we would need a second computer to monitor all that a first computer is processing. But, then, how do we know that our redundant system is working properly, without incorporating yet another monitor?
-
We know from many "vigilance" studies (Mackworth, 1950) that it is impossible for even a highly motivated human being to maintain effective visual attention towards a source of information on which very little happens, for more than about half an hour.
Meta: should I cap this stream to half an hour?
-
Manual operators may come into the control room quarter to half an hour before they are due to take over control, so they can get this feel for what the process is doing.
How can we do this with software systems? Even on a single machine, there are hundreds+ processes running at once, so getting an idea of a machine's current state becomes so difficult that it borders on impossible; we can keep track of exceptional states because they are (ostensibly) rare, catastrophic, and show-stopping. But, the normal, "all green" state? There's too much going on.
-
There is some concern that the present generation of automated systems, which are monitored by former manual operators, are riding on their skills, which later generations of operators cannot be expected to have.
See: Jonathan Blow lecture on how civilizations collapse -- https://www.youtube.com/watch?v=pW-SOdj4Kkk
-
One is that efficient retrieval of knowledge from long-term memory depends on frequency of use (consider any subject which you passed an examination in at school and have not thought about since).
Making a case for using Anki for common error codes and situations at work/on codebases.
-
This means that a formerly experienced operator who has been monitoring an automated process may now be an inexperienced one.
Atrophies. "I know I've seen this before, but how did I solve it last year?"
-
the operator can be left with an arbitrary collection of tasks, and little thought may have been given to providing support for them.
Talk to your constituents!
-
The difficulty remains that they are less effective when under time pressure
Difference in pace when doing a regular release, compared to having to rollback due to a catastrophic defect in prod
-
"even highly automated systems, such as electric power networks, need human beings for supervision, adjustment, maintenance, expansion and improvement. Therefore one can draw the paradoxical conclusion that automated systems still are man-machine systems, for which both technical and human factors are important."
Compare/contrast to computing machines that taught operators applications of calculus
-
- Apr 2022
-
newschematic.org newschematic.org LFS.pdf
-
The only operation that can’t be completed is the creation of a new file for which the inode is never written; in this case the directory entry will be removed
How did they test this? Were they able to hook into syscalls and break things on purpose?
-
Sprite LFS currently uses a checkpoint interval of thirty seconds, which is probably much too short.
What would it take to turn this value into something that's dynamic and self-tuned by the system as it runs a user's specific workload?
-
In order to handle a crash during a checkpoint operation there are actually two checkpoint regions, and checkpoint operations alternate between them.
Redundancies for your redundancies
-
The key to achieving high performance at low cost in a log-structured file system is to force the disk into a bimodal segment distribution where most of the segments are nearly full, a few are empty or nearly empty, and the cleaner can almost always work with the empty segments.
Generalized learnings
-
Figure 3 suggests that the segments cleaned must have a utilization of less than .8 in order for a log-structured file system to outperform the current Unix FFS; the utilization must be less than .5 to outperform an improved Unix FFS.
Working on policies 3 & 4 from above
-
we call this approach age sort
"Generational"
-
The version number combined with the inode number form an unique identifier (uid) for the contents of the file.
We'd usually jump to hashing for unique identifiers at higher levels of systems
-
If it does, then the block is live; if it doesn’t, then the block is dead
Reference counting
-
segment summary block
Solves the problems of marking which blocks are live, as well as identifying a block's files and the position of the block within the file
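The liveness check described in these highlights can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Sprite LFS's actual data layout: the `Inode` and `SummaryEntry` classes, their fields, and the dictionary-based inode map are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    version: int
    # block number within the file -> disk address currently holding that block
    block_addrs: dict = field(default_factory=dict)


@dataclass
class SummaryEntry:
    """What a segment summary block records about one data block (assumed shape)."""
    inode_number: int
    version: int
    block_number: int


def is_live(entry, disk_addr, inode_map):
    """Return True if the block stored at disk_addr is still referenced by its file."""
    inode = inode_map.get(entry.inode_number)
    # uid check: (inode number, version) must match, otherwise the file was
    # deleted or truncated after this block was written.
    if inode is None or inode.version != entry.version:
        return False
    # Pointer check: the file's block pointer must still refer to this address.
    return inode.block_addrs.get(entry.block_number) == disk_addr


# Hypothetical state: file 7, version 2, whose block 0 now lives at address 100.
inode_map = {7: Inode(version=2, block_addrs={0: 100})}
```

The version comparison is the cheap path: a mismatch settles the question without looking at the file's block pointers at all.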
-
This allows whole-segment operations to run at nearly the full bandwidth of the disk, regardless of the order in which segments are accessed
What would it take to make this sizing dynamic?
-
Once a file’s inode has been found, the number of disk I/Os required to read the file is identical in Sprite LFS and Unix FFS.
Read performance is on-par; implementation is the same.
-
The fundamental idea of a log-structured file system is to improve write performance by buffering a sequence of file system changes in the file cache and then writing all the changes to disk sequentially in a single disk write operation.
star
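The buffering idea in the highlight can be sketched as a tiny append-only writer. This is a toy under stated assumptions, not the paper's implementation: the `LogWriter` class, its `flush_threshold` parameter, and the use of an in-memory stream in place of a disk are all hypothetical.

```python
import io


class LogWriter:
    """Collect small changes in memory, then append them to the log in one write."""

    def __init__(self, log, flush_threshold=4):
        self.log = log                      # any object with a write(bytes) method
        self.flush_threshold = flush_threshold
        self.pending = []
        self.write_calls = 0                # how many times we touched the "disk"

    def record(self, change: bytes):
        self.pending.append(change)
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        self.log.write(b"".join(self.pending))  # one sequential write for many changes
        self.write_calls += 1
        self.pending.clear()


log = io.BytesIO()                          # stands in for the on-disk log
writer = LogWriter(log, flush_threshold=3)
for change in (b"a", b"b", b"c", b"d", b"e", b"f"):
    writer.record(change)
```

Six changes reach the log in two writes here. The cost is that anything still in `pending` is lost on a crash, which is the trade-off the later highlight about non-volatile RAM addresses.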
-
In designing a log-structured file system we decided to focus on the efficiency of small-file accesses, and leave it to hardware designers to improve bandwidth for large-file accesses
Limiting scope of research to make it applicable to the general case.
-
several studies have measured mean file sizes of only a few kilobytes
How has this changed over time? Seems like we should have an OS utility to collect this information.
-
Buffering
Fundamental question: how does this logging approach relate to buffering?
-
for applications that require better crash recovery, non-volatile RAM may be used for the write buffer.
Change the hardware to handle this edge case?
-
Disk transfer bandwidth
How has this improved with SSDs?
-
technology, which provides a set of basic building blocks, and workload, which determines a set of operations that must be carried out efficiently.
Different contexts for our technology determine the workload, i.e., a file system for a smartphone has different needs than one for a desktop or a cloud machine
-
System
- Category: file systems
- Context: examining and improving bottlenecks in OS speed in late 80s/early 90s
- Correctness: bold claims ("order of magnitude"), but seem justified
- Contributions: many which are used today
- Clarity: seems clear so far
-
Why Aren’t Operating Systems Getting Faster As Fast as Hardware?
TODO read
-
collect large amounts of new data in a file cache in main memory, then write the data to disk in a single large I/O that can use all of the disk’s bandwidth
Buffering
-
Unix systems can only utilize 5-10% of a disk’s raw bandwidth for writing new data; the rest of the time is spent seeking
How does this compare to the *nix systems of today?
-
it segregates older, more slowly changing data from young rapidly-changing data and treats them differently during cleaning.
Compare/contrast: generational garbage collection
-
February 1992
Predates LevelDB, et al, by a good bit
-
-
bradfieldcs.com bradfieldcs.com Hashing
-
With chaining, increased collisions means an increased number of items on each chain.
And worse cache locality, therefore, worse performance
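A minimal chained hash table in Python makes the relationship between bucket count and chain length visible. The class and method names are hypothetical, and integer keys are used so the bucket placement is easy to predict.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by keeping a list (chain) per bucket."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        chain = self._chain(key)
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)     # overwrite an existing key
                return
        chain.append((key, value))          # collision or empty bucket: extend the chain

    def get(self, key):
        for existing_key, value in self._chain(key):  # cost grows with chain length
            if existing_key == key:
                return value
        raise KeyError(key)

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)


crowded = ChainedHashTable(num_buckets=8)
roomy = ChainedHashTable(num_buckets=128)
for n in range(100):
    crowded.put(n, str(n))
    roomy.put(n, str(n))
```

With 100 keys in 8 buckets every lookup walks a chain of a dozen or so entries, while 128 buckets keep each chain at a single entry. The cache-locality cost from the note comes on top of that when chains are linked nodes scattered across memory.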
-
-
-
Go has a culture of rejecting large dependency trees, and of preferring a bit of copying to adding a new dependency.
How to apply this culture to others?
-
-
www.energy.gov www.energy.gov
-
Right now, roughly 95% of the hydrogen produced in the U.S. comes from natural gas. It’s produced through a process called steam methane reforming and emits roughly 830 million tonnes of carbon dioxide per year
fml
(as well as my children's lives)
-
-
-
Published: 27 Dec 2013
This would be some good research to reproduce 10 years on, and compare/contrast how the space has improved (or not!).
-
-
danluu.com danluu.com
-
What we should do about this is a big topic, in the time we have left, one thing we can do instead of writing to files is to use databases.
Oof.
-
- Mar 2022
-
70sbig.com 70sbig.com 70's Big
-
Life is inherently full of unavoidable and heartbreaking suffering. At some point, you or everyone you love will die. It’s unavoidable. So, in between bouts of unavoidable suffering, we benefit from seeking out the beauty life entails.
Lovely.
-
-
www.theguardian.com www.theguardian.com
-
We need to rediscover our ambition to improve public health as we did in the 18th and 19th centuries.
Especially as populations get older and therefore more vulnerable to disease in general, it's wild that this isn't a higher priority.
-
-
ai.googleblog.com ai.googleblog.com
-
int mid = low + ((high - low) / 2);
Note to self
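Why this form matters: with fixed-width integers, `low + high` can overflow before the division happens. Python integers do not overflow, so the sketch below simulates a 32-bit signed `int` with a hypothetical `to_int32` helper. It is an illustration of the arithmetic, not code from the highlighted post.

```python
def to_int32(value):
    """Wrap an integer the way a 32-bit signed int overflows."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value >= 0x8000_0000 else value


def truncating_half(value):
    """Divide by two, rounding toward zero as C and Java integer division do."""
    return -(-value // 2) if value < 0 else value // 2


def naive_mid(low, high):
    return truncating_half(to_int32(low + high))    # the sum may wrap negative


def safe_mid(low, high):
    return low + (high - low) // 2                  # no overflow when 0 <= low <= high


def binary_search(items, target):
    """Return the index of target in sorted items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = safe_mid(low, high)
        if items[mid] < target:
            low = mid + 1
        elif items[mid] > target:
            high = mid - 1
        else:
            return mid
    return -1


low, high = 2**30, 2**31 - 1     # both fit in 32 bits, but their sum does not
```

For these bounds `naive_mid` comes out negative, which would index out of range, while `safe_mid` stays between `low` and `high`.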
-
-
-
I'm going to make another note about the close()/dup() combination since it's pretty weird. close(1) frees up file descriptor 1 (standard output). dup(pfds[1]) makes a copy of the write-end of the pipe in the first available file descriptor, which is "1", since we just closed that. In this way, anything that ls writes to standard output (file descriptor 1) will instead go to pfds[1] (the write end of the pipe).
This is a big gotcha. We need to close this, or else the OS won't understand that the input we don't care about isn't being used, and will continue listening from that unused input.
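The same redirection can be tried from Python on a Unix-like system. This sketch uses `os.dup2`, which names the target descriptor explicitly instead of relying on the close-then-dup ordering in the highlight; the function name is hypothetical. It also shows why the unused pipe ends get closed: the reader only sees end-of-file once every copy of the write end is closed.

```python
import os


def capture_child_stdout(message: bytes) -> bytes:
    """Fork a child whose standard output is redirected into a pipe; return what it wrote."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make fd 1 refer to the write end of the pipe.
        os.close(read_end)
        if write_end != 1:
            os.dup2(write_end, 1)
            os.close(write_end)          # fd 1 is now the only copy in the child
        os.write(1, message)             # "standard output" goes into the pipe
        os._exit(0)

    # Parent: close our copy of the write end, or read() below would never see EOF.
    os.close(write_end)
    chunks = []
    while True:
        chunk = os.read(read_end, 4096)
        if not chunk:                    # empty read means every writer has closed
            break
        chunks.append(chunk)
    os.close(read_end)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

Calling `capture_child_stdout(b"hello")` returns the bytes the child wrote to its descriptor 1.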
-
read from the other end in the order it came in
FIFO queue
-
pipes will fill up after you write about 10K to them without reading anything out.
Good trivia, and maybe a fun thing to figure out how to measure on different OS's.
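One way to measure this, sketched in Python for Unix-like systems: put the write end in non-blocking mode and write until the kernel refuses. The function name and the 512-byte chunk size are arbitrary choices for the sketch.

```python
import os


def pipe_capacity(chunk_size=512):
    """Return how many bytes a fresh pipe accepts before a write would block."""
    read_end, write_end = os.pipe()
    os.set_blocking(write_end, False)    # a full pipe now raises instead of hanging
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_end, chunk)
            except BlockingIOError:      # pipe is full
                return total
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(f"pipe capacity: {pipe_capacity()} bytes")
```

The result depends on the operating system and its configuration, so the number printed on a modern machine may be quite different from the roughly 10K mentioned in the highlight.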
-
File descriptors are simply ints that are analogous to FILE*'s in stdio.h.
Come back to this.
-
-
newschematic.org newschematic.org
-
Since all source programs were always available and easily modified on-line, we were willing to revise and rewrite the system and its software when new ideas were invented, discovered, or suggested by others.
Consideration #3
-
Second there have always been fairly severe size constraints on the system and its software
Consideration #2
-
First, since we are programmers, we naturally designed the system to make it easy to write, test, and run programs.
Consideration #1
-
Our goals throughout the effort, when articulated at all, have always concerned themselves with building a comfortable relationship with the machine and with exploring ideas and inventions in operating systems.
Dev Ex (developer experience), but also open-ended research!
-
dissatisfied
The history of scratching an itch turning into something Way Bigger
-
floating point hardware
We haven't always had floating point, much less floating point chips!
-
There is also a quit signal which is used to force a core image to be produced. Thus programs which loop unexpectedly may be halted and the core image examined without prearrangement.
Has this functionality been carried over to GNU/Linux and/or macOS?
-
Such faults cause the processor to trap to a system routine.
Is "trap" really the most descriptive word for this?
-
Thus when editing system users log in, they are inside the editor and can begin work immediately;
Emacs machines!
-
Thus a user may log out simply by typing the end-of-file sequence in place of a command to the Shell.
When working in a different shell (bash within zsh), I'll type "exit" when I'm done.
-
This is easy because, by agreement, the smallest unused file descriptor is assigned when a new file is opened (or created); it is only necessary to close file 0 (or 1) and open the named file.
Blew my mind when I first learned about how this happens "by agreement"!
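The "smallest unused descriptor" rule is easy to observe from Python on a Unix-like system. A minimal sketch, with a hypothetical function name; it assumes nothing else in the process opens or closes descriptors between the calls.

```python
import os


def reopen_gets_same_descriptor(path=os.devnull):
    """Open, close, and reopen a file; return the two descriptor numbers."""
    first = os.open(path, os.O_RDONLY)   # takes the smallest unused number
    os.close(first)                      # that number is free again
    second = os.open(path, os.O_RDONLY)  # so the next open receives it
    os.close(second)
    return first, second


first, second = reopen_gets_same_descriptor()
print(first, second)
```

This is the whole trick behind redirection as the paper describes it: close descriptor 0 or 1, open the named file, and the file lands on the number the program already reads from or writes to.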
-
The
Lots packed into this one paragraph!
-
When this happens, the Shell knows the command is finished, so it types its prompt and reads the typewriter to obtain another command.
Seems like, yes, shell is parent process for everything else.
-
The Shell also returns immediately for another request.
Will we know our tasks are finished when the input suddenly appears?
-
Some filters which we have found useful
Are we limited to functions that operate on a line-by-line basis?
-
Actually it would be surprising, and in fact unwise for efficiency reasons, to expect authors of commands such as ls to provide such a wide variety of output options.
Another reason to play nicely with conventions -- less work to do to implement tons of functionality provided by other, existing programs
-
in fact it is interpreted completely by the Shell and is not passed to the command at all.
The shell does a lot of work behind the scenes (thinking about this, as well as splitting command-line args)
-
the command need merely use the standard file descriptors 0 and 1 where appropriate.
Use shell's conventions to play nicely with other programs
-
file descriptors 0 and 1
STDIN and STDOUT, in today's parlance
-
If file command cannot be found, the Shell prefixes the string /bin/ to command and attempts again to find the file. Directory /bin contains all the commands intended to be generally used.
Presage the PATH variable?
-
Wait may also present status from a grandchild or more distant ancestor;
Unexpected!
-
Ordinarily, arg1 should be the same string as file,so that the program may determine the name by which itwas invoked.
"Ordinarily"? What's extraordinarily?
-
the pipe must be set up by a common ancestor of the processes involved
Since they reference a section of the Shell portion of the paper, could the shell be considered a common ancestor?
-
flies
🪰🪰
-
using the same system read and write calls that are used for file system I/O
This usage of the same calls for different contexts is 💯
-
fork (label)
API is different nowadays.
-
the size of which may be extended by a system call.
Which system call(s)?
-
We will not attempt any interpretation of these figures nor any comparison with other systems, but merely note that we are generally satisfied with the overall performance of the system.
LGTM!
-
The current version of UNIX avoids the issue by not charging any fees at all.
What kind of fees are they talking about? $$$?
-
To the user, both reading and writing of files appear to be synchronous and unbuffered.
Not actually the case! Batch writes to maintain this illusion while keeping up performance (more goes into it than this)
-
These files may be as large as 8⋅256⋅512, or 1,048,576 (2^20) bytes.
~1 megabyte
-
If the link-count drops to 0, any disk blocks in the file are freed and the i-node is deallocated.
How they handle deletes
-
This pointer is an integer called the i-number (for index number) of the file. When the file is accessed, its i-number is used as an index into a system table (the i-list) stored in a known part of the device on which the directory resides. The entry thereby found (the file’s i-node) contains the description of the file as follows.
Still how we do it!
-
It is possible to generate an end-of-file from a typewriter by use of an escape sequence which depends on the device used
ctrl + d
-
the byte array
Still using the same ways of representing data!
-
truncates it to zero length if it does exist
Foot gun?
-
Since the actual user ID of the invoker of any program is always available
Can we still look at these today on UNIX-like systems?
-
There is a threefold advantage in treating I/O devices this way: file and device I/O are as similar as possible; file and device names have the same syntax and meaning, so that a program expecting a file name as a parameter can be passed a device name; finally, special files are subject to the same protection mechanism as regular files.
Keep those interfaces the same!
-
by convention
"by convention" -- is this not enforced somehow?
-
UNIX differs from other systems in which linking is permitted in that all links to a file have equal status. That is, a file does not exist within a particular directory; the directory entry for a file consists merely of its name and a pointer to the information actually describing the file.
Still the case!
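Both ideas, equal-status links and the link count, can be checked on a current Unix-like system with a few calls from Python's standard library. A minimal sketch; the file names are arbitrary and it assumes the temporary directory's file system supports hard links.

```python
import os
import tempfile


def link_demo():
    """Create a file, hard-link it, remove the first name, and report what happened."""
    with tempfile.TemporaryDirectory() as directory:
        original = os.path.join(directory, "original.txt")
        alias = os.path.join(directory, "alias.txt")
        with open(original, "w") as handle:
            handle.write("hello")

        os.link(original, alias)                     # second directory entry, same i-node
        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        links_before = os.stat(alias).st_nlink

        os.remove(original)                          # drops one link; the data stays
        links_after = os.stat(alias).st_nlink
        with open(alias) as handle:
            contents = handle.read()
        return same_inode, links_before, links_after, contents
```

Neither name is the "real" one: removing `original.txt` only decrements the count, and the contents stay readable through `alias.txt` until the last link is gone.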
-
Files are named by sequences of 14 or fewer characters.
Why did this limit exist?
-
All files in the system can be found by tracing a path through a chain of directories until the desired file is reached.
Path down the tree
-
symbolic or binary (object) programs
Was this novel, having these two types of files?
-
However, the structure of files is controlled by the programs which use them, not by the system
Agnostic
-
The PDP-11 has a 1M byte fixed-head disk, used for file system storage and swapping, four moving-head disk drives which each provide 2.5M bytes on removable disk cartridges, and a single moving-head disk drive which uses removable 40M byte disk packs
TODO find pictures of these
-
All of these programs were written locally. It is worth noting that the system is totally self-supporting.
Relate to Bootstrapping, as envisioned by Douglas Engelbart -- the idea that the thing you're building can help you build that thing faster and more effectively
-
preparation and formatting of patent applications
Dog fooding?
-
(1) a hierarchical file system incorporating demountable volumes; (2) compatible file, device, and inter-process I/O; (3) the ability to initiate asynchronous processes; (4) system command language selectable on a per-user basis; and (5) over 100 subsystems including a dozen languages.
Compare to the three main responsibilities of OS's as outlined in OSTEP.
-
-
www.man7.org www.man7.org
-
The list of arguments must be terminated by a null pointer, and, since these are variadic functions, this pointer must be cast (char *) NULL.
myargs[0] = strdup("wc");      // program: "wc" (word count)
myargs[1] = strdup("p3.c");    // argument: file to count
myargs[2] = NULL;              // marks end of array
execvp(myargs[0], myargs);     // runs word count
From OSTEP, ch 5: https://pages.cs.wisc.edu/~remzi/OSTEP/cpu-api.pdf
-
-
www.theatlantic.com www.theatlantic.com
-
The word meritocracy has been around since the late 1950s, when a British sociologist named Michael Young published The Rise of the Meritocracy. He meant this new word as a warning
TIL
-
-
www.hyrumslaw.com www.hyrumslaw.com
-
consumers often come to expect a certain level of performance from its implementation.
Having to program around poor performance from a third-party API, and then how that breaks when the API improves
-
-
pages.cs.wisc.edu pages.cs.wisc.edu
-
Brinch Hansen’s excellent history
-
-
devblogs.microsoft.com devblogs.microsoft.com
-
which would skipped over by JavaScript engines
But still sent across the wire, yes?
-
But if those trends we mentioned above continue, compiling away your types might be the only step between writing your TypeScript and running it, and we don’t want to be the ones standing in the way of a good developer experience!
I wonder if they've tried tools like esbuild or swc, and factored in those times in their assessment of the ecosystem.
-
evergreen browsers
TODO find out which browsers he means
-
-
-
You don’t send a message unless you need to teach someone else a lesson. The one you’re making an example of isn’t the one you’re trying to teach a lesson. It’s the ones who are next in line. Putin is telling us he won’t stop here. We are the intended recipients of this message he is sending by making an example of Ukraine, in blood and death.
The main point
-
-
ccr.sigcomm.org ccr.sigcomm.org
-
It helps to jot down the key points, or to make comments in the margins, as you read.
A great blog post on how to do this well, aimed at book readers: https://jamesstuber.com/how-to-read/
-
read their related work sections
Finding commonly-referenced work, building connections between researchers and topics.
-
that is, making the same assumptions as the authors, re-create the work.
What to do on the third pass
-
To fully understand a paper, particularly if you are a reviewer, requires a third pass
When to take the third pass
-
Common mistakes like these will separate rushed, shoddy work from the truly excellent.
Reality check
-
you can expect most reviewers (and readers) to make only one pass over it
Keep the end-users in mind!
-
The first pass is adequate for papers that aren’t in your research area, but may someday prove relevant.
And then capture in a note-taking system?
-
A typical researcher will likely spend hundreds of hours every year reading papers
Unexpected -- 160 working hours in a month, so multiple months worth of reading papers?
-
However, this skill is rarely taught, leading to much wasted effort
Along with other skills related to programming
-
-
-
10x speed
This app allows for up to 16x playback:
cmd + [
to double the playback speed. Quicktime only goes up to 2x and VLC not much more than that.
-
I highly recommend recording your screen.
On macOS, I've done this with the built-in Quicktime application. It created a large file, but worked quite well.
-
- Feb 2022
-
github.com github.com
-
Recovery
When designing systems, is it useful to design the recovery process up-front, and have the rest of the design decisions flow from there?
-
we could shard the set of files into multiple directories.
todo: revisit the meaning of this suggestion
-
Assuming a disk IO rate of 100MB/s (ballpark range for modern drives)
"Latency numbers every programmer should know"
-
level
It was certainly helpful (for me!) to read Chapter 3 of Designing Data-Intensive Applications before reading this doc.
-
a special young level
I immediately think of generational garbage collection. Is this an instance of taking concepts from one field (GC) and applying them to another?
-
These merges have the effect of gradually migrating new updates from the young level to the largest level using only bulk reads and writes (i.e., minimizing expensive seeks).
The benefit of this approach
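A rough sketch of why a merge needs only sequential reads and writes: if both levels are already sorted by key, they can be combined in a single forward pass, with the newer entry winning when a key appears in both. This is a simplified model, not the actual implementation; `merge_levels` and the sample data are made up for illustration, and each run is assumed to hold a key at most once.

```python
import heapq
from itertools import groupby


def merge_levels(newer, older):
    """Merge two key-sorted runs of (key, value) pairs; newer wins on duplicate keys."""
    # Tag entries with an age rank so that, for equal keys, the newer entry sorts first.
    tagged_newer = ((key, 0, value) for key, value in newer)
    tagged_older = ((key, 1, value) for key, value in older)
    # heapq.merge walks both inputs front to back, like a sequential scan of two files.
    merged = heapq.merge(tagged_newer, tagged_older)
    # Keep only the first (newest) entry in each group of equal keys.
    return [(key, next(group)[2]) for key, group in groupby(merged, key=lambda e: e[0])]


young = [("a", 1), ("c", 30)]
largest = [("a", 0), ("b", 2), ("c", 3)]
result = merge_levels(young, largest)
print(result)
```

Neither input is ever accessed out of order, which is the property that lets a disk-based version avoid seeks.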
-
This copy is consulted on every read so that read operations reflect all logged updates.
Losing this ability is a downside of using e.g. ElasticSearch as the (very, very fast) user-facing data store, but not the actual source of truth.
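To make the read path concrete, here is a minimal sketch assuming a store that appends every update to a log and mirrors it in an in-memory copy, which reads check before falling back to older data. `TinyStore` and its attribute names are hypothetical, and plain Python containers stand in for the log file and the on-disk tables.

```python
class TinyStore:
    """Toy key-value store: reads consult the in-memory copy of the log first."""

    def __init__(self, older_data=None):
        self.log = []                              # stands in for the append-only log file
        self.memtable = {}                         # in-memory copy of logged updates
        self.sorted_table = dict(older_data or {}) # stands in for older on-disk data

    def put(self, key, value):
        # Record the update in the log, then mirror it in memory.
        self.log.append((key, value))
        self.memtable[key] = value

    def get(self, key):
        # Checking the in-memory copy first is what makes reads reflect logged updates.
        if key in self.memtable:
            return self.memtable[key]
        return self.sorted_table.get(key)


store = TinyStore({"k": "old"})
before = store.get("k")
store.put("k", "new")
after = store.get("k")
print(before, after)
```

A separate search index fed asynchronously from the source of truth has no equivalent of the `memtable` check, so a read right after a write can still return the old value.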
-
-
jamesstuber.com jamesstuber.com
-
You’ve written a note. Mark the page with a sticky flag or dog-ear it for access later.
This suggestion is new to me.
-
- Jan 2022
-
http2-explained.haxx.se http2-explained.haxx.se
-
With ALPN, the client gives the server a list of protocols in its order of preference and the server picks the one it wants, while with NPN the client makes the final choice.
Difference
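The difference comes down to which side gets the last word. The sketch below models only that decision, not the TLS handshake or any real library API; both function names are hypothetical, and it assumes the side that decides simply takes the first protocol in its own preference order that the other side also lists.

```python
def alpn_choose(client_offer, server_preference):
    """ALPN model: the client offers a list; the server makes the final choice."""
    for proto in server_preference:
        if proto in client_offer:
            return proto
    return None  # no protocol in common


def npn_choose(server_advertised, client_preference):
    """NPN model: the server advertises a list; the client makes the final choice."""
    for proto in client_preference:
        if proto in server_advertised:
            return proto
    return None  # no protocol in common


client = ["h2", "http/1.1"]   # client prefers h2
server = ["http/1.1", "h2"]   # server prefers http/1.1

alpn_result = alpn_choose(client, server)
npn_result = npn_choose(server, client)
print(alpn_result, npn_result)
```

With identical protocol lists on both sides, the two schemes can still land on different protocols: under ALPN the server's preference wins, under NPN the client's does.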
-
ALPN is being promoted for use by http2, while SPDY clients and servers still use NPN.
Is SPDY deprecated? If so, does anyone still use NPN?
-
-
medium.com medium.com
-
“Build an MVP” is the mental image that comes to mind, but “Get One User” is a better answer.
-
-
blog.nelhage.com blog.nelhage.com
-
Learning more about software systems is a compounding skill. The more systems you’ve seen, the more patterns you have available to match future systems against, and the more skills and tricks and techniques you develop to apply to future problems.
"Knowledge accretes."
-
It’s nearly always worth trying the easier approach first (upgrading a dependency, reaching for the debugger, a few passes of trial-and-error cargo-culting from working examples), and only reach for the big guns if those tools fail you.
TODO: adapt Polya's checklist, meshed with Acton's go-to's here from this talk -- https://www.youtube.com/watch?v=4B00hV3wmMY&t=785s
-
However, that wasn’t my goal; I just wanted something that worked, within a time budget. And so, I instead just ended up without anything that worked, and without much of a better understanding of anything, either.
Constant evaluation of whether or not one's process is getting one closer to their goal
-
They apparently had an explicit goal of a 100% rate of root-causing kernel crashes based on a single crash report. In order to strive for this goal they had built a lot of elaborate crash-reporting and debugging technology — it wasn’t just raw thinking hard about bugs — but at root I think this goal comes from the deep belief that their system, while complex, is understandable and mostly deterministic, and that they have the ability to reason about it.
Running this up the chain.
-
These bugs practically require finding someone on your team who is comfortable moving between multiple layers of the stack, in order to track down
todo: stop calling J$
-
checkout of that library’s source code on my laptop
todo
-
spend time learning more about the systems they depend on, and how they work and how they are implemented.
Blub studies
-
-
www.benkuhn.net www.benkuhn.net
-
Because of this compounding effect, the most important step toward becoming a blub master is to kickstart your “blub flywheel”—the virtuous cycle of blub accumulation—however you can.
Steps to get started on your studies
-
-
-
“the destiny of computing is to become interactive intellectual amplifiers for all humanity pervasively networked worldwide”.
Miles to go
-
-
paulgraham.com paulgraham.com
-
The reason it pays to put off even those errands is that real work needs two things errands don't: big chunks of time, and the right mood. If you get inspired by some project, it can be a net win to blow off everything you were supposed to do for the next few days to work on it. Yes, those errands may cost you more time when you finally get around to them. But if you get a lot done during those few days, you will be net more productive.
Reminder to self to read this post: https://fortelabs.co/blog/productivity-for-precious-snowflakes
-
-
artypapers.com artypapers.com
-
Look with suspicion upon every comfortable moment.
Quality life advice
-
“No one has the right to be an amateur in the matter of mental training. It is a shame for a person to grow old without seeing the beauty and strength of which the mind is capable.”
-
- Dec 2021
-
-
What does the output tell you?
Each byte is wrapped in a pair of single-quotes, which means they're runes.
-
-
comment.org comment.org
-
In the late nineteenth century, the wealthy taxpayers of Boston were convinced to build out water and sewage systems by a straightforward logic: every person in the city, rich or poor, needed clean water to drink every day. Without it, they would be at risk of contracting water-borne diseases like cholera. And with so many people in proximity, wealth alone couldn’t provide protection from contagious disease.
Lessons unlearned
-
-
techcrunch.com techcrunch.com
-
The life of a tech journalist can be soul-sapping. I talk to so many founders — capable, smart, well-connected, experienced founders — who are working on problems that genuinely just don’t matter. They might have successfully raised money. Perhaps they even rounded up a team of smart people like them. They are solving problems that even if they execute everything A++ and they change the world exactly in the way they envision, even if they make their board and VCs happy, it’s ultimately so fantastically futile. I don’t want to name any names. Not to inspire existential angst, but if you’re reading this, and you’re feeling a twinge of “god what am I doing with my life,” well, sorry, friend, I am talking about you.
Remember.
-
- Nov 2021
-
cavaliercoder.github.io cavaliercoder.github.io
-
I need to be more trusting and slightly less hardcore.
Don't ever change.
-
I’ve introduced a new benchmark, BenchmarkRand, to measure the overhead of the RNG. We can substract the resulting overhead from the other benchmarks for an approximation of their actual runtime.
Honest.
-
-
www.matthewball.vc www.matthewball.vc
-
And of course, nearly all of the internet would exist without them.
~of course~ crucially
-
VR headsets aren’t the Metaverse any more than smartphones are the mobile internet
Broaden perspectives and my thinking
-
like saying the mobile internet is an app
FB would like billions of people to believe this.
-
In other words, we will constantly be ‘within’ the internet, rather than have access to it, and within the billions of interconnected computers around us, rather than occasionally reach for them, and alongside all other users and real-time.
"within...interconnected computers around us"
How do IoT devices and the connected, networked home factor into the enablement of the Metaverse? Does my internet-connected washer become a minigame in "Clean Yer House," where games (read: dopamine dispensers) are created based on the current state of my environment? "'Candy Crush' but for emptying the dishwasher"
-
Once a few plants began this transformation, the entire market was forced to catch up, thereby spurring more investment and innovation in electricity-based infrastructure, equipment, and processes.
Keeping up with the Joneses
-
When plants first adopted electrical power, it was typically used for lighting and/or to replace a plant’s on-premises source of power (usually steam). These plants did not, however, rethink or replace the legacy infrastructure which would carry this power throughout the factory and put it to work.
Applications of new technology aren't just non-obvious at first; they're also difficult to implement when they replace an existing system (rather than building into a void).
-
- Oct 2021
-
research.swtch.com research.swtch.com
-
int is still 32 bits
This doesn't seem to be the case anymore (Oct '21): https://play.golang.org/p/9lpSXBkrWp9
-
- Jul 2021
-
www.dougengelbart.org www.dougengelbart.org
-
Our minds do not think inside of pages, files or apps. We think in concepts, we dart around fluidly at whatever level of detail suits the moment, connecting the dots, sparking aha moments. Information technology could be augmenting our collective human intellect at the speed of thought in powerful new ways, instead of automating how we used to think and work in a linear paper-based world.
Browsers should be able to support this. Instead of opening an interesting-looking link in the background, the preview appears in a sidebar, aligned with the link. Available, but not distracting from the main purpose of reading.
-
-
numinous.productions numinous.productions
-
This is certainly true in Silicon Valley, where it’s common to meet accomplished technical makers who, after reading a few stories from Richard Hamming and Richard Feynman, think they understand research well enough that they can “create the new Bell Labs”. Usually they’re victims of Dunning-Krugeritis, so ignorant they’re not even aware of their ignorance.
What are the ways to avoid falling into this trap? Research through double-loop learning?
-
- Jun 2021
-
www.theatlantic.com www.theatlantic.com
-
Rather than a single centralized network modeled after the legacy telephone system, operated by a government or a few massive utilities, the internet was designed to allow any device anywhere to interoperate with any other device, allowing any provider able to bring whatever networking capacity it had to the growing party.
And to be able to survive a nuclear attack, one of ARPA's motivations in funding the network. See "The Dream Machine" for more history.
-