722 Matching Annotations
  1. Last 7 days
    1. Scientists at EPFL in Switzerland have shown that you are more likely to initiate a voluntary decision as you exhale.

      We make conscious decisions when we breathe out, says new study involving 52 people pressing a button, monitored with brain, heart and lung sensors.

    1. I'm fond of the gitmoji commit convention. It relies on categorizing commits using emojis. I'm a visual person so it suits me well, but I understand this convention is not for everyone.

      You can add gitmojis (emojis) in your commits, such as:

      :recycle: Make core independent from the git client (#171)
      :whale: Upgrade Docker image version (#167)

      which render on GitHub/GitLab as:

      ♻️ Make core independent from the git client (#171)
      🐳 Upgrade Docker image version (#167)
    2. If you use practices like pair or mob programming, don't forget to add your coworkers' names in your commit messages

      It's good to give a shout-out to developers who collaborated on the commit. For example:

      $ git commit -m "Refactor usability tests.

      Co-authored-by: name <name@example.com>
      Co-authored-by: another-name <another-name@example.com>"
    3. Separate subject from body with a blank line. Limit the subject line to 50 characters. Capitalize the subject line. Do not end the subject line with a period. Use the imperative mood in the subject line. Wrap the body at 72 characters. Use the body to explain what and why vs. how.

      7 rules of good commit messages.
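      Most of these rules are mechanical, so they can be checked automatically. The sketch below is only an illustration: check_commit_message is a hypothetical helper, and it does not cover the imperative mood or the "what and why" rule, which need human judgment. A subject that starts with an emoji would also be flagged as not capitalized.

```python
def check_commit_message(message: str) -> list[str]:
    """Return a list of rule violations for a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    subject = lines[0] if lines else ""

    if len(subject) > 50:
        problems.append("subject longer than 50 characters")
    if subject and not subject[0].isupper():
        problems.append("subject not capitalized")
    if subject.endswith("."):
        problems.append("subject ends with a period")
    if len(lines) > 1 and lines[1].strip():
        problems.append("no blank line between subject and body")
    # Body lines start after the blank separator line.
    if any(len(line) > 72 for line in lines[2:]):
        problems.append("body line longer than 72 characters")
    return problems


good = "Refactor usability tests\n\nExplain what changed and why."
bad = "fixed stuff.\nno blank line here"
print(check_commit_message(good))
print(check_commit_message(bad))
```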

      >more info<

    1. A combination of split(), substring(), removePrefix(), removeSuffix() is usually enough.

      Sometimes the following functions are more than enough for your string matching problems

  2. Feb 2020
    1. An exploratory plot is all about you getting to know the data. An explanatory graphic, on the other hand, is about telling a story using that data to a specific audience.

      Exploratory vs Explanatory plot

  3. Jan 2020
    1. The nonlocal keyword is used to work with variables inside nested functions, where the variable should not belong to the inner function. Use the keyword nonlocal to declare that the variable is not local.

      nonlocal lets a nested function rebind a variable that belongs to the enclosing function.

      Example (without nonlocal, the output would be "John", not "hello"):

      def myfunc1():
        x = "John"
        def myfunc2():
          nonlocal x
          x = "hello"
        myfunc2()  # run the inner function so the rebinding happens
        return x

      print(myfunc1())  # prints: hello
    1. We only need to use global keyword in a function if we want to do assignments / change them. global is not needed for printing and accessing.

      global inside a function is:

      • required if we want to assign/change variable
      • not required if we just want to print/access a variable.

      Example code:

      # This function modifies the global variable 's'
      def f():
          global s
          print(s)
          s = "Look for Geeksforgeeks Python Section"
          print(s)

      # Global scope
      s = "Python is great!"
      f()
      print(s)
    1. Several cryptocurrencies use DAGs rather than blockchain data structures in order to process and validate transactions.

      DAG vs Blockchain:

      • DAG transactions are linked to each other rather than grouped into blocks
      • DAG transactions can be processed simultaneously with others
      • DAGs ease the bottleneck on transaction throughput. In a blockchain, throughput is limited by, for example, the number of transactions that fit in a single block
    2. graph data structure that uses topological ordering, meaning that the graph flows in only one direction, and it never goes in circles.

      Simple definition of Directed Acyclic Graph (DAG)

    3. 'Directed' means that the edges of the graph only move in one direction, where future edges are dependent on previous ones.

      Meaning of "directed" in Directed Acyclic Graph

    4. 'Acyclic' means that it is impossible to start at one point of the graph and come back to it by following the edges.

      Meaning of "acyclic" in Directed Acyclic Graph
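      The "one direction, never in circles" property is what makes a topological ordering possible. As a rough sketch (Kahn's algorithm, with a made-up four-node graph; the function name and the dict-of-lists input format are assumptions for illustration), ordering the nodes also doubles as a cycle check:

```python
from collections import deque


def topological_order(edges):
    """Return nodes in topological order, or raise ValueError on a cycle.

    `edges` maps each node to the nodes it points to (Kahn's algorithm).
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    indegree = {node: 0 for node in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1

    # Start from nodes that nothing points to.
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in edges.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)

    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return order


dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_order(dag))
```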

    1. When you get tired of thinking about a piece of work and feeling bad for not finishing it yet, go "screw it, let's do it" and start with something, anything.

      One way of starting to do what we postpone

    2. When you know that you don't have to make the greatest thing ever right from the start, it's easier to start. And then it's easier to continue.

      Apply the MVP principle to start doing

    1. UI layouts are represented as component trees. And XML is ideal for representing tree structures. It’s a match made in heaven! In fact, the most popular UI frameworks in the world (HTML and Android) use XML syntax to define layouts.

      XML works great for displaying UI layouts

    2. JSON is well-suited for representing lists of objects with complex properties. JSON’s key/value object syntax makes it easy. By contrast, XML’s attribute syntax only works for simple data types. Using child elements to represent complex properties can lead to inconsistencies or unnecessary verbosity.

      JSON works well for list of objects with complex properties. XML not so much

    3. The advantages of XML over JSON for trees become more pronounced when we introduce different node types. Assume we wanted to introduce departments into the org chart above. In XML, we can just use an element with a new tag name
    4. XML may not be ideal to represent generic data structures, but it excels at representing one particular structure: the tree. By separating node data (attributes) from parent/child relationships, the tree structure of the data shines through, and the code to process the data can be quite elegant.

      XML is good for representing tree structured data
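      To illustrate the point about separating node data from parent/child relationships, here is a small sketch using Python's standard xml.etree.ElementTree. The org chart and its names are invented for the example; node data lives in attributes and the hierarchy lives in the nesting, so the processing code is a short recursion.

```python
import xml.etree.ElementTree as ET

# Hypothetical org chart: node data in attributes, hierarchy in nesting.
ORG_CHART = """
<employee name="Ada" title="CEO">
  <employee name="Grace" title="CTO">
    <employee name="Linus" title="Engineer"/>
  </employee>
  <employee name="Alan" title="CFO"/>
</employee>
"""


def describe(node, depth=0):
    """Return one indented line per node, walking the tree depth-first."""
    lines = ["  " * depth + f"{node.get('name')} ({node.get('title')})"]
    for child in node:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(ORG_CHART.strip())
print("\n".join(describe(root)))
```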

    5. A particular strength of JSON is its support for nested data structures

      JSON supports arrays, such as:

      "favorite_movies": [ "Diehard", "Shrek" ]
    6. JSON’s origins as a subset of JavaScript can be seen with how easily it represents key/value object data. XML, on the other hand, optimizes for document tree structures, by cleanly separating node data (attributes) from child data (elements)

      JSON for key/value object data

      XML for document tree structures (clearly separating node data (attributes) from child data (elements))
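      On the JSON side, key/value objects and nested arrays map directly onto dictionaries and lists. A minimal sketch with Python's standard json module (the record itself is made up for illustration):

```python
import json

# Hypothetical record: an object holding a string, an array and a nested object.
raw = """
{
  "name": "Jason",
  "favorite_movies": ["Diehard", "Shrek"],
  "address": {"city": "Zurich", "zip": "8001"}
}
"""

person = json.loads(raw)
print(person["favorite_movies"][1])  # second element of the nested array
print(person["address"]["city"])     # value inside the nested object
```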

    7. Each format makes tradeoffs in encoding, flexibility, and expressiveness to best suit a specific use case.

      Each data format brings different tradeoffs:

      • A format optimized for size will use a binary encoding that won’t be human-readable.
      • A format optimized for extensibility will take longer to decode than a format designed for a narrow use case.
      • A format designed for flat data (like CSV) will struggle to represent nested data.
    1. Since we have much faster CPUs now, numerical calculations are done in Python which is much slower than Fortran. So numerical calculations basically take the same amount of time as they did 20 years ago.

      Python vs Fortran ;)

    2. Your project has no business value today unless it includes blockchain and AI, although a centralized and rule-based version would be much faster and more efficient.

      Comparing current project needs to those 20 years ago

    3. There is StackOverflow which simply didn’t exist back then. Asking a programming question involved talking to your colleagues.

      20 years ago StackOverflow wouldn't give you a hand

    4. Cross-platform development is now a standard because of wide variety of architectures like mobile devices, cloud servers, embedded IoT systems. It was almost exclusively PCs 20 years ago.
    5. IDEs and the programming languages are getting more and more distant from each other. 20 years ago an IDE was specifically developed for a single language, like Eclipse for Java, Visual Basic, Delphi for Pascal etc. Now, we have text editors like VS Code that can support any programming language with IDE like features.

      How IDEs became "unified" compared to 20 years ago

    6. Language tooling is richer today. A programming language was usually a compiler and perhaps a debugger. Today, they usually come with the linter, source code formatter, template creators, self-update ability and a list of arguments that you can use in a debate against the competing language.

      How much more tooling support coding has compared to 20 years ago

    7. Being a software development team now involves all team members performing a mysterious ritual of standing up together for 15 minutes in the morning and drawing occult symbols with post-its.

      In comparison to 20 years ago ;)

    8. A package management ecosystem is essential for programming languages now. People simply don’t want to go through the hassle of finding, downloading and installing libraries anymore. 20 years ago we used to visit web sites, download zip files, copy them to the correct locations, add them to the paths in the build configuration and pray that they worked.

      How library management changed in 20 years

    1. Level 0 is no automation whatsoever. Level 1 is partial assistance with certain aspects of driving, like lane keep assist or adaptive cruise control. Level 2 is a step up to systems that can take control of the vehicle in certain situations, like Tesla's Autopilot or Cadillac's Super Cruise, while still requiring the driver to pay attention. Get past that and we enter the realm of speculation: Level 3 promises full computer control under defined conditions during a journey, Level 4 expands that to start-to-finish autonomous tech limited only by virtual safeguards like a geofence, and Level 5 is the total hands-off, go-anywhere-at-the-push-of-a-button experience.

      Description of 6 levels defining autonomous cars:

      1. Level 0 - no automation.
      2. Level 1 - partial assistance with certain aspects of driving, like lane keep assist or adaptive cruise control.
      3. Level 2 - step up to systems that can take control of the vehicle in certain situations, like Tesla's Autopilot or Cadillac's Super Cruise, while still requiring the driver to pay attention.
      4. Level 3 - promises full computer control under defined conditions during a journey.
      5. Level 4 - expands that to start-to-finish autonomous tech limited only by virtual safeguards like a geofence.
      6. Level 5 - total hands-off, go-anywhere-at-the-push-of-a-button experience.
    2. The CEO of Volkswagen's autonomous driving division recently admitted that Level 5 autonomy—that's full computer control of the vehicle with zero limitations—might actually never happen.
    1. Most of the time, the only way to break into a circle is for someone within that circle to speak positively on your behalf.

      Who speaks on your behalf?!

    2. I have observed something else under the sun. The fastest runner doesn’t always win the race, and the strongest warrior doesn’t always win the battle. The wise sometimes go hungry, and the skillful are not necessarily wealthy. And those who are educated don’t always lead successful lives. It is all decided by chance, by being in the right place at the right time. — Ecclesiastes 9:11
    3. They simply possess the willpower and drive to observe people, get to know people, appear in gatherings that involve people that are aligned with their goals, and connect people with one another.

      A skill that sometimes puts you ahead of the smartest minds

    1. This quote from Richard Feynman is at the top of my blog’s landing page: I learned very early the difference between knowing the name of something and knowing something.

      Inspiration to maintain a research blog

    2. Summarizing a paper in your own words restructures the content to focus on learning rather than novelty.

      Scientific papers are written to convey novelty, so early-career readers may mistake that style for the way scientists normally communicate day to day

    3. Blogging has taught me how to read a paper because explaining something is a more active form of understanding. Now I summarize the main contribution in my own words, write out the notation and problem setup, define terms, and rederive the main equations or results. This process mimics the act of presenting and is great practice for it.

      Why teaching others/blogging has a great value in terms of learning new topics

    4. When I first started teaching myself to program, I felt that I had no imagination. I couldn’t be creative because I was too focused on finding the syntax bug or reasoning about program structure. However, with proficiency came creativity. Programming became less important than what I was building and why.

      While learning, don't worry about creativity; it comes after gaining proficiency (a knowledge base)

    5. In my opinion the reason most people fail to do great research is that they are not willing to pay the price in self-development. Say some new field opens up that combines field XXX and field YYY. Researchers from each of these fields flock to the new field. My experience is that virtually none of the researchers in either field will systematically learn the other field in any sort of depth. The few who do put in this effort often achieve spectacular results.

      I think we all know that...

    6. Many of us have done this on exams, hoping for partial credit by stitching together the outline of a proof or using the right words in an essay with the hopes that the professor connects the dots for us.

      We often use jargon we don't understand just to pretend we know something

    1. In a multi-class model, we can plot N AUC-ROC curves for N classes using the one-vs-all methodology. For example, if you have three classes named X, Y and Z, you will have one ROC for X classified against Y and Z, another ROC for Y classified against X and Z, and a third for Z classified against Y and X.

      Using AUC ROC curve for multi-class model

    2. When AUC is approximately 0, the model is actually inverting the classes: it predicts the negative class as positive and vice versa.

      If AUC ≈ 0

    3. When AUC is 0.5, the model has no class separation capacity whatsoever.

      If AUC = 0.5

    4. An AUC near 1 means the model has a good measure of separability.

      If AUC ≈ 1

    5. ROC is a probability curve and AUC represents the degree or measure of separability. It tells how well the model can distinguish between classes.

      ROC & AUC
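      One way to make these notes concrete is to compute AUC directly as the probability that a randomly chosen positive sample gets a higher score than a randomly chosen negative one, and then repeat it per class for the one-vs-all case. This is a plain-Python sketch, not a library API: the function names and the toy scores are assumptions for illustration.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. `labels` holds 1 (positive) or 0 (negative).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(true_classes, class_scores):
    """Compute one AUC per class, treating that class as positive."""
    return {
        cls: auc([1 if t == cls else 0 for t in true_classes], scores)
        for cls, scores in class_scores.items()
    }


# Hypothetical scores from a three-class model for four samples.
truth = ["X", "Y", "Z", "X"]
scores = {
    "X": [0.9, 0.2, 0.1, 0.6],
    "Y": [0.05, 0.3, 0.3, 0.4],
    "Z": [0.05, 0.1, 0.6, 0.1],
}
result = one_vs_rest_auc(truth, scores)
print(result)
# A fully inverted ranking: every negative scores above every positive.
inverted = auc([1, 1, 0, 0], [0.1, 0.2, 0.8, 0.9])
print(inverted)
```

      In this toy data classes X and Z are perfectly separated, class Y is no better than chance, and the inverted example lands at the "AUC ≈ 0" end of the scale.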

    1. Work never ends. No matter how much you get done there will always be more. I see a lot of colleagues burn out because they think their extra effort will be noticed. Most managers appreciate it but do not promote their employees.

      Common reality of overworking

    1. “No Code” systems are extremely good for putting together proofs-of-concept which can demonstrate the value of moving forward with development.

      Great point of "no code" trend

    2. With someone else’s platform, you often end up needing to construct elaborate work-arounds for missing functionality, or indeed cannot implement a required feature at all.

      You can quickly implement 80% of the solution in Salesforce using a mix of visual programming (basic rule setting and configuration), but later it's not so straightforward to add the missing 20%

    3. the developer doesn’t need to worry about allocating memory, or the character set encoding of the string, or a host of other things.

      Comparison of C (1972) and TypeScript (2012) code.

      (check the code above)

    4. First, you’ve spread the logic across a variety of different systems, so it becomes more difficult to reason about the application as a whole. Second, more importantly, the logic has been implemented as configuration as opposed to code. The logic is constrained by the ability of the applications which have been wired together, but it’s still there.

      Why "no code" trend is dangerous in some way (on the example of Zapier):

      1. You spread the logic across multiple systems.
      2. Logic is maintained in configuration rather than code.
    1. Sometimes, the best way to learn is to mimic others. Here are some great examples of projects that use documentation well:

      Examples of projects that use documentation well

      (check the list below)

    2. Along with these tools, there are some additional tutorials, videos, and articles that can be useful when you are documenting your project

      Recommended videos to start documenting

      (check the list below)

    3. Documenting your code, especially large projects, can be daunting. Thankfully there are some tools out and references to get you started

      You can always facilitate documentation with tools.

      (check the table below)

    4. Daniele Procida gave a wonderful PyCon 2017 talk and subsequent blog post about documenting Python projects. He mentions that all projects should have the following four major sections to help you focus your work:

      Public and Open Source Python projects should have the docs folder, and inside of it:

      • Tutorials
      • How-To Guides
      • References
      • Explanations

      (check the table below for a summary)

    5. The general layout of the project and its documentation should be as follows:
      ├── project/  # Project source code
      ├── docs/
      ├── README
      ├── examples.py

      (private, shared or open sourced)

    6. There are specific docstrings formats that can be used to help docstring parsers and users have a familiar and known format.

      Different docstring formats:

    7. If you use argparse, then you can omit parameter-specific documentation, assuming it has been correctly documented within the help parameter of the argparse.ArgumentParser.add_argument method. It is recommended to use __doc__ for the description parameter of argparse.ArgumentParser’s constructor.
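
      A minimal sketch of that advice, assuming a made-up greeting script: the module docstring is passed as description, and each argument is documented through its help parameter. The functions build_parser and greet are hypothetical names used only here.

```python
"""Greet a user from the command line (hypothetical example script)."""
import argparse


def build_parser():
    """Create the parser, reusing the module docstring as the description."""
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("name", help="name of the person to greet")
    parser.add_argument(
        "--shout", action="store_true", help="print the greeting in upper case"
    )
    return parser


def greet(argv):
    """Parse the given argument list and return the greeting text."""
    args = build_parser().parse_args(argv)
    message = f"Hello {args.name}"
    return message.upper() if args.shout else message


print(greet(["World", "--shout"]))
```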


    8. Scripts are considered to be single file executables run from the console. Docstrings for scripts are placed at the top of the file and should be documented well enough for users to be able to have a sufficient understanding of how to use the script.

      Docstrings in scripts

    9. Class constructor parameters should be documented within the __init__ method docstring


    10. Class method docstrings should contain the following: a brief description of what the method is and what it’s used for; any arguments (both required and optional) that are passed, including keyword arguments; labels for any arguments that are considered optional or have a default value; any side effects that occur when executing the method; any exceptions that are raised; any restrictions on when the method can be called.

      Class method should contain:

      • brief description
      • arguments
      • label on default/optional arguments
      • side effects description
      • raised exceptions
      • restrictions on when the method can be called

      (check example below)
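
      As an illustration of the checklist above, here is a sketch of a documented method. The Account class is invented for the example, and the section headings inside the docstring follow one common layout rather than a required format.

```python
class Account:
    """A minimal bank account (hypothetical example)."""

    def __init__(self, balance=0):
        """Create an account.

        Args:
            balance: Opening balance. Optional, defaults to 0.
        """
        self.balance = balance

    def withdraw(self, amount, allow_overdraft=False):
        """Take money out of the account and return the new balance.

        Args:
            amount: Sum to withdraw.
            allow_overdraft: Optional, defaults to False. If True, the
                balance may go negative.

        Side effects:
            Reduces self.balance by amount.

        Raises:
            ValueError: If amount exceeds the balance and overdraft is
                not allowed.

        Restrictions:
            Call only with a positive amount.
        """
        if amount > self.balance and not allow_overdraft:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return self.balance


account = Account(100)
print(account.withdraw(30))
```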

    11. Docstrings can be further broken up into three major categories: Class Docstrings (class and class methods); Package and Module Docstrings (package, modules, and functions); Script Docstrings (script and functions).

      3 main categories of docstrings

    12. All multi-lined docstrings have the following parts: a one-line summary line; a blank line following the summary; any further elaboration for the docstring; another blank line.

      Multi-line docstring example:

      """This is the summary line

      This is the further elaboration of the docstring. Within this section,
      you can elaborate further on details as appropriate for the situation.
      Notice that the summary and the elaboration are separated by a blank
      line.
      """

      # Notice the blank line above. Code should continue on this line.
    13. In all cases, the docstrings should use the triple-double quote (""") string format.

      Always use """ for docstrings

    14. Docstring conventions are described within PEP 257. Their purpose is to provide your users with a brief overview of the object.

      Docstring conventions

    15. say_hello.__doc__ = "A simple function that says hello... Richie style"

      Example of using __doc__:

      Code (version 1):

      def say_hello(name):
          print(f"Hello {name}, is it me you're looking for?")
      say_hello.__doc__ = "A simple function that says hello... Richie style"

      Code (alternative version):

      def say_hello(name):
          """A simple function that says hello... Richie style"""
          print(f"Hello {name}, is it me you're looking for?")


      >>> help(say_hello)


      Help on function say_hello in module __main__:

      say_hello(name)
          A simple function that says hello... Richie style
    16. Since everything in Python is an object, you can examine the directory of the object using the dir() command

      The dir() function lists the attributes of a Python object. For example dir(str).

      Inside dir(str) you can find the interesting attribute __doc__

    17. Along with docstrings, Python also has the built-in function help() that prints out the object's docstring to the console.

      help() function.

      Typing help(str) prints the documentation of the str class

    18. Documenting your Python code is all centered on docstrings. These are built-in strings that, when configured correctly, can help your users and yourself with your project’s documentation.

      Docstrings - built-in strings that help with documentation

    19. From examining the type hinting, you can immediately tell that the function expects the input name to be of a type str, or string. You can also tell that the expected output of the function will be of a type str, or string, as well.

      Type hints, introduced in Python 3.5, go one step beyond Jeff Atwood's 4 rules by making the code document itself, as in this example:

      def hello_name(name: str) -> str:
          return f"Hello {name}"
      • the reader knows that the function expects an input of type str
      • and that it returns a str as well
    20. Comments to your code should be kept brief and focused. Avoid using long comments when possible. Additionally, you should use the following four essential rules as suggested by Jeff Atwood:

      Comments should be as concise as possible. Moreover, you should follow 4 rules of Jeff Atwood:

      1. Keep comments close to the code being described.
      2. Don't use complex formatting (such as tables).
      3. Don't comment obvious things.
      4. Design code in a way it comments itself.
    21. Commenting your code serves multiple purposes

      Multiple purposes of commenting:

      • planning and reviewing code - setting up a code template
      • code description
      • algorithmic description - for example, explaining the work of an algorithm or the reason of its choice
      • tagging - BUG, FIXME, TODO
    22. In general, commenting is describing your code to/for developers. The intended main audience is the maintainers and developers of the Python code. In conjunction with well-written code, comments help to guide the reader to better understand your code and its purpose and design

      Commenting code:

      • describing code to/for developers
      • help to guide the reader to better understand your code, its purpose/design
    23. Documenting code is describing its use and functionality to your users. While it may be helpful in the development process, the main intended audience is the users.

      Documenting code:

      • describing use to your users (main audience)
    24. According to PEP 8, comments should have a maximum length of 72 characters.

      If comment_size > 72 characters:

      use `multiple line comment`
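
      If you want to split a long remark automatically, the standard textwrap module can do it. A small sketch (wrap_comment is a hypothetical helper, and the 72-character default simply mirrors the limit quoted above):

```python
import textwrap


def wrap_comment(text, width=72):
    """Split text into '# ' comment lines of at most `width` characters."""
    return textwrap.wrap(
        text, width=width, initial_indent="# ", subsequent_indent="# "
    )


long_note = (
    "This helper is hypothetical: it only shows how a long remark can be "
    "broken into several comment lines that respect the PEP 8 limit."
)
for line in wrap_comment(long_note):
    print(line)
```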
    25. “Code is more often read than written.” — Guido van Rossum
    1. Dutch programmer Guido van Rossum designed Python in 1991, naming it after the British television comedy Monty Python's Flying Circus because he was reading the show's scripts at the time.

      Origins of Python name

    1. Most VQA models would use some kind of Recurrent Neural Network (RNN) to process the question input
      • Most VQA will use RNN to process the question input
      • Simpler VQA datasets can get by with a bag-of-words (BOW) representation that turns the question into a vector fed to a standard (feedforward) NN
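
      As a rough idea of what the BOW step looks like, the sketch below counts vocabulary words in a question. The vocabulary and the tokenization (lowercasing, dropping the question mark, splitting on spaces) are simplifying assumptions, not the method of any particular VQA model.

```python
def bag_of_words(question, vocabulary):
    """Count how often each vocabulary word occurs in the question."""
    tokens = question.lower().replace("?", "").split()
    return [tokens.count(word) for word in vocabulary]


# Hypothetical vocabulary for a shapes-and-colors dataset.
vocab = ["what", "color", "is", "the", "triangle", "circle", "shape"]
vector = bag_of_words("What color is the triangle?", vocab)
print(vector)
```

      The resulting fixed-length vector can be passed to a feedforward network regardless of how long the question was.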
    2. Here’s a very simple example of how a VQA system might answer the question “what color is the triangle?”
      1. Look for shapes and colours using CNN.
      2. Understand the question type with NLP.
      3. Determine strength for each possible answer.
      4. Convert each answer strength to % probability
    3. The standard approach to performing VQA looks something like this: 1) Process the image. 2) Process the question. 3) Combine features from steps 1 and 2. 4) Assign probabilities to each possible answer.

      Approach to handle VQA problems:


    4. Visual Question Answering (VQA): answering open-ended questions about images. VQA is interesting because it requires combining visual and language understanding.

      Visual Question Answering (VQA) = visual + language understanding

    1. Softmax turns arbitrary real values into probabilities

      Softmax function -

      • outputs of the function are in range [0,1] and add up to 1. Hence, they form a probability distribution
      • the calculation involves e (the mathematical constant) and operates on n numbers: $$s(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$
      • the bigger the value, the higher its probability
      • lets us answer classification questions with probabilities, which are more useful than simpler answers (e.g. binary yes/no)
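
      The properties listed above are easy to check numerically. A minimal sketch in plain Python follows; the input values are arbitrary, and subtracting the maximum is a common numerical-stability trick that is not part of the formula itself (it cancels out in the ratio).

```python
import math


def softmax(values):
    """Map real numbers to probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow and leaves the result unchanged.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([-1.0, 0.0, 3.0, 5.0])
print([round(p, 4) for p in probs])
print(sum(probs))
```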
    1. LR is nothing but the binomial regression with logit link (or probit), one of the numerous GLM cases. As a regression - itself it doesn't classify anything, it models the conditional (to linear predictor) expected value of the Bernoulli/binomially distributed DV.

      Logistic Regression - the ultimate definition (it's not a classification algorithm!)

      It is used for classification only once we apply a probability threshold, for example 50%.
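
      The distinction between modeling a probability and classifying can be sketched in a few lines. The coefficients below are invented for illustration (they are not fitted to any data), and the function names are hypothetical.

```python
import math


def predict_probability(x, intercept, slope):
    """Logistic regression output: P(y = 1 | x) via the inverse logit."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def classify(x, intercept, slope, threshold=0.5):
    """Turn the modeled probability into a class label using a threshold."""
    return 1 if predict_probability(x, intercept, slope) >= threshold else 0


# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -3.0, 1.5
for x in (1.0, 2.0, 3.0):
    print(x, round(predict_probability(x, b0, b1), 3), classify(x, b0, b1))
```

      The regression itself only produces the probabilities; the label depends entirely on the threshold passed to classify.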

    1. Since water is denser than air and the reflection is diffuse, a lot of light is internally reflected, thereby increasing the probability of absorption at the surface.

      The light is reflected back inside the water, because of the total internal reflection:

      • water is denser than air
      • angle of incidence is greater than the so-called critical angle

    2. This is because the light now has a layer of water to go through. And due to the reflectance of water, not all light at the air-liquid-interface (border between air and water) goes through the water. Some of it is reflected.

      Wet things look darker because the water layer reflects part of the light at its surface, so not all of the light is transmitted through it.

      The probability of light getting transmitted is: 1 - R1 (reflectance at the air-liquid interface)

    3. There are two types of reflection (two ways the wave can be thrown back): specular and diffuse.

      Two types of reflection:

      1. specular - light leaves the surface at the same angle it hits it
      2. diffuse - hitting light is scattered into all angles when reflected
    1. 1. Logistic regression IS a binomial regression (with logit link), a special case of the Generalized Linear Model. It doesn't classify anything *unless a threshold for the probability is set*. Classification is just its application. 2. Stepwise regression is by no means a regression. It's a (flawed) method of variable selection. 3. OLS is a method of estimation (among others: GLS, TLS, (RE)ML, PQL, etc.), NOT a regression. 4. Ridge, LASSO - it's a method of regularization, NOT a regression. 5. There are tens of models for the regression analysis. You mention mainly linear and logistic - it's just the GLM! Learn the others too (link in a comment). STOP with the "17 types of regression every DS should know". BTW, there're 270+ statistical tests. Not just t, chi2 & Wilcoxon

      5 clarifications to common misconceptions shared over data science cheatsheets on LinkedIn

    1. ericb 12 days ago: * Better googling. Time-restricted, url restricted, site restricted searches. Search with the variant parts of error messages removed. * Read the source of upstream dependencies. Fix or fork them if needed. * They're better at finding forks with solutions and gleaning hints from semi-related issues. * Formulate more creative hypotheses when obvious lines of investigation run out. The best don't give up. * Dig in to problems with more angles of investigation. * Have more tools in their toolbelt for debugging like adding logging, monkey-patching, swapping parts out, crippling areas to rule things out, binary search of affected code areas. * Consider the business. * Consider user-behavior. * Assume hostile users (security-wise). * Understand that the UI is not a security layer. Anything you can do with Postman your backend should handle. * Whitelist style-security over blacklist style. * See eventual problems implied by various solutions. * "The Math."

      What do top engineers do that others don't?

      • Better googling. Time-restricted, url restricted, site restricted searches. Search with the variant parts of error messages removed.
      • Read the source of upstream dependencies. Fix or fork them if needed.
      • They're better at finding forks with solutions and gleaning hints from semi-related issues.
      • Formulate more creative hypotheses when obvious lines of investigation run out. The best don't give up.
      • Dig in to problems with more angles of investigation.
      • Have more tools in their toolbelt for debugging like adding logging, monkey-patching, swapping parts out, crippling areas to rule things out, binary search of affected code areas.
      • Consider the business.
      • Consider user-behavior.
      • Assume hostile users (security-wise).
      • Understand that the UI is not a security layer. Anything you can do with Postman your backend should handle.
      • Whitelist style-security over blacklist style.
      • See eventual problems implied by various solutions.
      • "The Math."
    1. technology diffused more easily along lines of latitude than along lines of longitude because climate changed more rapidly along lines of longitude making it more difficult for both humans and technologies to adapt

      Technology adapts better across latitude than longitude

    2. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.

      Before 15 it's just a part of our world creation, after 35 it's the natural order of things

    3. "One of history’s few iron laws is that luxuries tend to become necessities and to spawn new obligations. Once people get used to a certain luxury, they take it for granted. Then they begin to count on it. Finally they reach a point where they can’t live without it.”

      Be careful what becomes your necessity to live

    4. A 90s study showed that women preferred the scents of men whose immune-system genes were most different from their own. Evolutionarily this makes sense, as children should be healthier if their parents’ genes vary, protecting them from more pathogens.

      Why women are attracted to the scents of men with dissimilar immune systems

    5. When People Work Together

      How to lay off your lovely co-workers

    6. Between 2011 and 2013, China used 50% more cement than the United States in the 20th century. Of the world’s 100 highest bridges, 81 are in China, including some unfinished ones.

      China's infrastructure is growing amazingly fast

    7. Here’s Warren Buffett: “Cola has no taste memory. You can drink one at 9am, 11am, 5pm. You can't do that with cream soda, root beer, orange, grape. You get sick of them after a while. The average person drinks 64 ounces of liquid per day, and you can have all 64 ounces of that be Coke.” Same with Doritos, Cheetos, most popular junk food. They are engineered to overcome “sensory-specific satiety” and to give a sense of “vanishing caloric density.”

      Why chips and coca-cola are addicting:

      the taste is vanishing

    8. One of the tasks of true friendship is to listen compassionately and creatively to the hidden silences. Often secrets are not revealed in words, they lie concealed in the silence between words.

      Listen to the silence

    9. Video Games are a Booming Industry: The video game industry generates more revenue than movies and music.

      Revenue raised by videogames > films + music

    10. Humor treads at the frontier of consciousness. When a comic finds a funny joke, they are unearthing a truth that people are only kind of aware of, but the whole room grasps that everybody else is aware of the truth, and laughter ensues.

      Humor as a comprehension

    11. Highest Grossing Media Franchises

      What type of media sells the most:

    12. How the Sun Moves
    13. “Twitter is the most amazing networking and learning network ever built. For someone who’s pursuing their dream job, or chasing a group of mentors or peers, it’s remarkable. In any given field, 50-80% of the top experts in that field are on Twitter and they’re sharing ideas, and you can connect to them or follow them in your personal feed. If you get lucky enough and say something they find interesting, they might follow you, and the reason this becomes super interesting is that unlocks direct message, and now all of a sudden you can communicate directly or electronically with that individual. Very, very powerful. If you’re not using Twitter, you’re missing out.” - Bill Gurley

      I cannot agree more with this, since I have finally built up a great network on Twitter. It's important to hit the bell icon next to the profiles we value the most, so that we never miss their new content

    14. 80% of Americans live, work, and hang out in the pink areas — 3.6 percent of the landmass of the lower 48 states.

      Map of where 80% of Americans live, work, and hang out

    15. "The easiest way to be discovered right now in technology and perhaps many fields is to create your own independent blog and write. There is a huge dearth in availability of good, current, first party content today. The single most important advice I can give to actually write is to write. The thing that happens which you don’t see until you write is that your content engages some of the smartest people who are lurking around the internet. And they reach out to you."

      Totally agree with this ;)

    16. By some estimates, more than 50,000 pieces of artwork are stolen each year, amounting to annual losses of around $6 to $8 billion globally. This makes art theft one of the most valuable criminal enterprises around, exceeded only by drug trafficking and arms dealing.

      Art crime is even more serious than we think

    17. Where is Wealth Concentrated?

      Map of wealth concentration

    18. This map shows the relative density of commercial shipping around the world. The darker the color, the busier the route.

      Map of global shipping routes

    19. I've noticed a weird pattern: In most of the best marriages I see, one person is an early-bird, and the other is a night-owl. They have opposite circadian rhythms. I think this is healthy. The two partners get the benefits of time together and time alone, which helps the marriage.

      Circadian rhythm as a way to validate a great marriage

    20. look around and figure out who you want to be on your team. Figure out the people around you that you want to work with for the rest of your life. Figure out the people who are smart & awesome, who share your values, who get things done — and maybe most important, who you like to be with and who you want to help win. And treat them right, always. Look for ways to help, to work together, to learn. Because in 20 years you’ll all be in amazing places doing amazing things.

      One of the best pieces of life advice anyone can get

    1. It goes completely against what most believe, but out of all major energy sources, nuclear is the safest

      Nuclear energy as the safest energy

  4. Dec 2019
    1. Continuous Delivery of Deployment is about running as thorough checks as you can to catch issues on your code. Completeness of the checks is the most important factor. It is usually measured in terms code coverage or functional coverage of your tests. Catching errors early on prevents broken code to get deployed to any environment and saves the precious time of your test team.

      Continuous Delivery of Deployment (quick summary)

    2. Continuous Integration is a trade off between speed of feedback loop to developers and relevance of the checks your perform (build and test). No code that would impede the team progress should make it to the main branch.

      Continuous Integration (quick summary)

    3. A good CD build: Ensures that as many features as possible are working properly The faster the better, but it is not a matter of speed. A 30-60 minutes build is OK

      Good CD build

    4. A good CI build: Ensures no code that breaks basic stuff and prevents other team members to work is introduced to the main branch Is fast enough to provide feedback to developers within minutes to prevent context switching between tasks

      Good CI build

    5. Continuous Deployment is the next step. You deploy the most up to date and production ready version of your code to some environment. Ideally production if you trust your CD test suite enough.

      Continuous Deployment

    6. The idea of Continuous Delivery is to prepare artefacts as close as possible from what you want to run in your environment. These can be jar or war files if you are working with Java, executables if you are working with .NET. These can also be folders of transpiled JS code or even Docker containers, whatever makes deploy shorter (i.e. you have pre built in advance as much as you can).

      Idea of Continuous Delivery

    7. Continuous Delivery is about being able to deploy any version of your code at all times. In practice it means the last or pre last version of your code.

      Continuous Delivery

    8. Studies show it takes ~23 minutes to deeply refocus on something when you get disturbed
    9. Continuous Integration is not about tools. It is about working in small chunks and integrating your new code to the main branch and pulling frequently.

      Continuous Integration is not about tools

    10. The app should build and start Most critical features should be functional at all times (user signup/login journey and key business features) Common layers of the application that all the developers rely on, should be stable. This means unit tests on those parts.

      Things to be checked by Continuous Integration

    11. Continuous Integration is all about preventing the main branch of being broken so your team is not stuck. That’s it. It is not about having all your tests green all the time and the main branch deployable to production at every commit.

      Continuous Integration prevents other team members from wasting time through a pull of faulty code

  5. unix4lyfe.org unix4lyfe.org
    1. if you care at all about storing timestamps in MySQL, store them as integers and use the UNIX_TIMESTAMP() and FROM_UNIXTIME() functions.

      MySQL does not store offset

    2. The system clock is inaccurate. You're on a network? Every other system's clock is differently inaccurate.

      System clocks are inaccurate

    3. A time format without an offset is useless.

      Always use timezone offset
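
      A small sketch of why the offset matters, assuming Python 3.7+ and its `datetime.fromisoformat`. The timestamp strings are made-up examples, not taken from the article:

```python
from datetime import datetime

# Two made-up timestamps: same wall-clock reading, different offsets.
sydney = datetime.fromisoformat("2019-12-01T10:00:00+10:00")
utc = datetime.fromisoformat("2019-12-01T10:00:00+00:00")

# With an offset, each string pins down exactly one instant.
gap_seconds = utc.timestamp() - sydney.timestamp()
print(gap_seconds)  # 36000.0, i.e. the two instants are ten hours apart

# Without an offset the string is ambiguous: Python parses it as a "naive"
# datetime that carries no timezone information at all.
naive = datetime.fromisoformat("2019-12-01T10:00:00")
print(naive.tzinfo)  # None
```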

    4. If you want to store a humanly-readable time (e.g. logs), consider storing it along with Unix time, not instead of Unix time.

      Logs should be stored along with Unix time
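
      One way this could look in practice. The helper name `format_log_line` and the line layout are assumptions made for illustration, not something the article prescribes:

```python
import time
from datetime import datetime, timezone


def format_log_line(message, unix_time=None):
    """Prefix a message with Unix time plus a readable UTC rendering of it."""
    if unix_time is None:
        unix_time = int(time.time())
    # The readable part is derived from the Unix time, never the other way round.
    readable = datetime.fromtimestamp(unix_time, tz=timezone.utc).isoformat()
    return f"{unix_time} {readable} {message}"


print(format_log_line("service started", unix_time=1575194400))
# 1575194400 2019-12-01T10:00:00+00:00 service started
```

      The number stays machine-sortable and unambiguous, while the ISO string is there purely for the human reading the log.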

    5. When storing a timestamp, store Unix time. It's a single number.

      Store Unix timestamps (single numbers)

    6. Most of your code shouldn't be dealing with timezones or local time, it should be passing Unix time around.

      Most of your code should use Unix time

    7. Unix time: Measured as the number of seconds since epoch (the beginning of 1970 in UTC). Unix time is not affected by time zones or daylight saving

      Unix time - # of seconds since 1970 in UTC
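
      A quick round trip with Python's standard library, as a sketch. The example date is arbitrary:

```python
from datetime import datetime, timezone

# The epoch itself is second zero.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(epoch.timestamp())  # 0.0

# Any timezone-aware instant collapses to a single number...
moment = datetime(2019, 12, 1, 10, 0, tzinfo=timezone.utc)
unix_time = int(moment.timestamp())
print(unix_time)  # 1575194400

# ...and the number converts back to the same instant.
restored = datetime.fromtimestamp(unix_time, tz=timezone.utc)
print(restored == moment)  # True
```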

    8. GMT is still used: it's the British national timezone in winter. In summer it becomes BST.

      GMT and BST in Britain

    9. Australian Eastern Standard Time is UTC+1000. e.g. 10:00 UTC is 20:00 EST on the same day.

      Example of UTC offset
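
      The same conversion sketched in Python. It uses a fixed +10:00 offset, which ignores daylight saving, and an arbitrary date:

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+10:00 offset standing in for Australian Eastern Standard Time.
utc_plus_10 = timezone(timedelta(hours=10))

utc_time = datetime(2019, 12, 1, 10, 0, tzinfo=timezone.utc)
local_time = utc_time.astimezone(utc_plus_10)

print(local_time.isoformat())  # 2019-12-01T20:00:00+10:00
print(local_time == utc_time)  # True: same instant, different rendering
```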

    10. GMT: UTC used to be called Greenwich Mean Time (GMT) because the Prime Meridian was (arbitrarily) chosen to pass through the Royal Observatory in Greenwich.

      GMT - previous name of UTC

    11. UTC: The time at zero degrees longitude (the Prime Meridian) is called Coordinated Universal Time (UTC is a compromise between the French and English initialisms)


    1. The two things I really like about working for smaller places or starting a company is you get very direct access to users and customers and their problems, which means you can actually have empathy for what's actually going on with them, and then you can directly solve it. That cycle is so powerful, the sooner you learn how to make that cycle happen in your career, the better off you'll be. If you can make software and make software for other people, the outcome truly is hundreds of millions of dollars worth of value if you get it right. That's where I'm here to try and encourage you to do. I'm not really saying that you shouldn't go work at a big tech company. I am saying you should probably leave before it makes you soft. 

      What are the benefits of working at the smaller companies/startups over the tech giants

    1. There is a good amount of prerequisite knowledge required to set up a Gatsby site - HTML, CSS, JavaScript, ES6, Node.js development environment, React, and GraphQL are the major ones.

      There are a few technologies to be familiar with before setting up a GatsbyJS blog:

      • HTML
      • CSS
      • JavaScript
      • ES6
      • Node.js
      • React
      • GraphQL

      but you can be fine with the Gatsby Getting Started Tutorial

    2. I feel great that all of my posts are now safely saved in version control and markdown. It’s a relief for me to know that they’re no longer an HTML mess inside of a MySQL database, but markdown files which are easy to read, write, edit, share, and backup.

      Good feeling of switching to GatsbyJS

    3. I’ll give you the basics of what I did in case you also want to make the switch.

      (check the text below this highlight for a great guide of migrating from WordPress to GatsbyJS)

    4. I had over 100 guides and tutorials to migrate, and in the end I was able to move everything in 10 days, so it was far from the end of the world.

      If you're smart, you can move from WordPress to GatsbyJS in ~ 10 days

    5. A few things I really like about Gatsby

      Main benefits of GatsbyJS:

      • No page reloads
      • Image optimisation
      • Pre-fetch resources
      • Bundling and minification
      • Server-side rendered, at build time
      • Articles are saved in beautiful Markdown
      • With Netlify, your site updates automatically when you push to the repo
    6. However, I realized that a static site generator like Gatsby utilizes the power of code/data splitting, pre-loading, pre-caching, image optimization, and all sorts of performance enhancements that would be difficult or impossible to do with straight HTML.

      Benefits of mixing HTML/CSS with some JavaScript (GatsbyJS):

      • code/data splitting
      • pre-loading
      • pre-caching
      • image optimisation
      • performance enhancements impossible with HTML
    7. There are a lot of static site generators to choose from. Jekyll, Hugo, Next, and Hexo are some of the big ones, and I’ve heard of and looked into some interesting up-and-coming SSGs like Eleventy as well.

      Other static site generators to consider, apart from GatsbyJS:

      • Jekyll
      • Hugo
      • Next
      • Hexo
      • Eleventy

    1. Gatsby is SEO friendly – it is part of the JAMStack after all!

      With Gatsby you don't have to worry about SEO

    2. Gatsby is a React based framework which utilises the powers of Webpack and GraphQL to bundle real React components into static HTML, CSS and JS files. Gatsby can be plugged into and used straight away with any data source you have available, whether that is your own API, Database or CMS backend (Spoiler Alert!).

      Good GatsbyJS explanation in a single paragraph

    1. we shield ourselves from existential threats, or consciously thinking about the idea that we are going to die, by shutting down predictions about the self,” researcher Avi Goldstein told The Guardian, “or categorizing the information as being about other people rather than ourselves.

      Our brain doesn't easily accept the fact that we will die some day. A short experiment showed this:

      volunteers watched images of faces paired with words like "funeral" or "burial", and whenever they saw their own face, the brain didn't show any surprise signals

    1. test of whether they know how to look for help. Are they able to read a manual? Can they formulate a search query? How do they assess whether the tutorial they found is suitable or reliable? What steps do they take to make sure they're finding - and learning - the right information?

      Interesting approach to hiring: put someone in front of an unfamiliar program, make them complete a set of tasks and observe how they look for help.

    1. Zugzwang (German for "compulsion to move", pronounced [ˈtsuːktsvaŋ]) is a situation found in chess and other games wherein one player is put at a disadvantage because they must make a move when they would prefer to pass and not move

      Zugzwang - I need to remember this word!

    1. Today, my process is enjoyably unsophisticated. When I want to post something, I first write it in a text file, copy my last blog post’s HTML file, paste in my new article, make some slight adjustments, update my list of posts, add it to my RSS file, and that’s basically it. Any page on my website can be anything I want it to be, like how, for example, double clicking on this article leads to a small easter egg.

      Interesting approach on ignoring any type of site generators

    1. I want to get the selected number of bins from the slider and pass that number into a python method and do some calculation/manipulation (return: “You have selected 30bins and I came from a Python Function”) inside of it then return some value back to my R Shiny dashboard and view that result in a text field.

      Using Python scripts inside R Shiny (in 6 steps):

      1. In ui.R create textOutput: textOutput("textOutput") (after plotOutput()).
      2. In server.R create handler: output$textOutput <- renderText({ }).
      3. Create python_ref.py and insert this code:
      4. Import reticulate library: library(reticulate).
      5. source_python() function will make Python available in R:
      6. Make sure you have these files in your directory:
      • app.R
      • python_ref.py

      Also make sure that you've imported the reticulate package into your R environment and sourced the script inside your R code.

      Hit run.
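
      The code for python_ref.py is not reproduced in this note, so here is a minimal sketch of what such a file could contain. The function name `describe_bins` is an assumption, and only the returned message comes from the quoted question:

```python
# python_ref.py (hypothetical contents; the function name is an assumption)


def describe_bins(n_bins):
    """Return the message the Shiny app would show for the chosen bin count."""
    # Values passed over from R can arrive as floats (e.g. 30.0), so normalise.
    n_bins = int(n_bins)
    return f"You have selected {n_bins}bins and I came from a Python Function"


if __name__ == "__main__":
    print(describe_bins(30))
```

      On the R side, after source_python("python_ref.py"), the function should be callable like any R function inside renderText, for example describe_bins(input$bins), where input$bins is an assumed slider id.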

    2. We save all of this code, the ui object, the server function, and the call to the shinyApp function, in an R script called app.R

      The same basic structure for all Shiny apps:

      1. ui object.
      2. server function.
      3. call to the shinyApp function.

      ---> examples <---

    3. server

      Server example of a Shiny app (check the code below):

      • random distribution is plotted as a histogram with the requested number of bins
      • code that generates the plot is wrapped in a call to renderPlot
    4. ui

      UI example of a Shiny app (check the code below)

    5. You can either create a one R file named app.R and create two seperate components called (ui and server inside that file) or create two R files named ui.R and server.R

    6. Currently Shiny is far more mature than Dash. Dash doesn’t have a proper layout tool yet, and also not build in theme, so if you are not familiar with Html and CSS, your application will not look good (You must have some level of web development knowledge). Also, developing new components will need ReactJS knowledge, which has a steep learning curve.

      Shiny > Dash:

      • Dash isn't as mature yet
      • Shiny has many more layout options, whereas in Dash you need to use HTML and CSS
      • developing new components in Dash requires ReactJS knowledge (not so easy)
    7. You can host standalone apps on a webpage or embed them in R Markdown documents or build dashboards. You can also extend your Shiny apps with CSS themes, Html widgets, and JavaScript actions.

      Typical tools used for working with Shiny

    1. walking, cycling and more green spaces not only cut air pollution but also improved mental health.

      Quick tip: skip public transport when you can, walk instead, and encourage others to do the same

    2. “Although the studies included were from different parts of the world – eg China, the US, Germany – and varied in sample size, study design and measures of depression, the reported associations were very similar.”

      The studies suggest that cultural background did not have a direct impact on the results

    3. The studies analysed took account of many factors that might affect mental health, including home location, income, education, smoking, employment and obesity. But they were not able to separate the potential impact of noise, which often occurs alongside air pollution and is known to have psychological effects.

      The studies on the effect of air pollution also took other factors into consideration

    4. more than 90% of the global population lives with air pollution above WHO-recommended levels

      Devastating news

    5. People exposed to an increase of 10 micrograms per cubic metre (µg/m3) in the level of PM2.5 for a year or more had a 10% higher risk of getting depression.

      Level of air impact

    6. The data analysed in the new research linked depression with air pollution particles smaller than 2.5 micrometres (equivalent to 0.0025 millimetres and known as PM2.5)

      Beware of particles smaller than 2.5 micrometres

    7. Other research indicates that air pollution causes a “huge” reduction in intelligence and is linked to dementia. A comprehensive global review earlier in 2019 concluded that air pollution may be damaging every organ and virtually every cell in the human body.

      Air pollution impacts more than just our mental health

    8. the finest particulates from dirty air can reach the brain via both the bloodstream and the nose, and that air pollution has been implicated in increased [brain] inflammation, damage to nerve cells and to changes in stress hormone production, which have been linked to poor mental health,” Braithwaite said

      How air impacts our mental health

    9. “You could prevent about 15% of depression, assuming there is a causal relationship. It would be a very large impact, because depression is a very common disease and is increasing.” More than 264 million people have depression, according to the WHO.

      Increase of depression rate

    10. Depression and suicide linked to air pollution in new global study Cutting toxic air might prevent millions of people getting depression, research suggests

      Depression linked to air quality

    1. Don't focus too much on the salary. It's just one tiny part of the whole package. Your dev job pays your rent, food and savings. I assume that most dev jobs do this quite well. Beyond this, the main goal of a job is to increase your future market value, your professional network and to have fun. So, basically it's about how much you are worth in your next job and that you enjoy your time. A high salary doesn't help you if you do stuff which doesn't matter in a few years.

      Don't focus on the salary in your dev job.

    1. x = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]] arr = [i for j in x for i in j] print(arr) >>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

      Flattening a multi-dimensional matrix into a 1-D array using list comprehension:

      x = [[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]]
      arr = [i for j in x for i in j]
      >>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
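
      The order of the two for clauses is easy to get backwards. As a sketch, here is the same flattening written as explicit loops, showing that the clauses read left to right with the outer loop first:

```python
x = [[0, 1, 2, 3, 4],
     [5, 6, 7, 8, 9]]

# Comprehension form: "for j in x" is the outer loop, "for i in j" the inner one.
flat = [i for j in x for i in j]

# Equivalent explicit loops, in the same order as the clauses above.
flat_loops = []
for j in x:          # each row
    for i in j:      # each element of that row
        flat_loops.append(i)

print(flat == flat_loops)  # True
```
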
    2. # Creating a 5x5 matrix arr = [[i for i in range(5)] for j in range(5)] arr >>> [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

      Nested for loop using list comprehension to come up with 5x5 matrix:

      arr = [[i for i in range(5)] for j in range(5)]
      >>> [[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]]
    3. arr = [i for i in range(10) if i % 2 == 0] print(arr) >>> [0, 2, 4, 6, 8] arr = ["Even" if i % 2 == 0 else "Odd" for i in range(10)] print(arr) >>> ['Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd']

      2 examples of conditional statements in list comprehension:

      arr = [i for i in range(10) if i % 2 == 0]
      >>> [0, 2, 4, 6, 8]


      arr = ["Even" if i % 2 == 0 else "Odd" for i in range(10)]
      >>> ['Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even', 'Odd']
    4. x = [2,45,21,45] y = {i:v for i,v in enumerate(x)} print(y) >>> {0: 2, 1: 45, 2: 21, 3: 45}

      Dictionary comprehension in Python to create a simple dictionary from a list:

      x = [2,45,21,45]
      y = {i:v for i,v in enumerate(x)}
      >>> {0: 2, 1: 45, 2: 21, 3: 45}
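
      For this exact index-to-value case, the built-in dict constructor gives the same result. The comprehension becomes more useful once keys or values are transformed. A short sketch:

```python
x = [2, 45, 21, 45]

by_comprehension = {i: v for i, v in enumerate(x)}
by_constructor = dict(enumerate(x))
print(by_comprehension == by_constructor)  # True

# Transforming the values is where the comprehension form pays off.
squares = {i: v * v for i, v in enumerate(x)}
print(squares[2])  # 441
```
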
    1. afternoons are spent reading/researching/online classes. This has really helped me avoid burn out. I go into the weekend less exhausted and more motivated to return on Monday and implement new stuff. It has also helped generate some inspiration for weekend/personal projects.

      Learning at work as solution to burn out and inspiration for personal projects

    1. Nothing truly novel, nothing that matters, is ever learned with ease

      If you don't struggle you don't learn, right?

    2. We want to learn, but we worry that we might not like what we learn. Or that learning will cost us too much. Or that we will have to give up cherished ideas.

      I believe it is normal to worry about how useful new domain knowledge will turn out to be

    3. Talented people flock to employers that promise to invest in their development whether they will stay at the company or not.

      Cannot agree more on that

    4. Leaders in every sector seem to agree: Learning is an imperative, not a cliché. Without it, careers derail and companies fail.

      Don't stop learning

    1. Unit Testing
      • Unit tests take a piece of the product and test that piece in isolation.
      • Unit testing should focus on testing small units.
      • Units should be tested independently of other units. This is typically achieved by mocking the dependencies.
    2. End-to-End Testing
      • End-to-end testing is a technique used to test whether the entire application flow behaves as expected from start to finish.
      • Tests that simulate real user scenarios can easily help to determine how a failing test would impact the user.
    3. Integration Testing
      • Integration testing is when we integrate two or more units.
      • An integration test checks their behavior as a whole, to verify that they work together coherently.
    4. There are three different methods of Automated Testing
      • Unit
      • Integration
      • End-to-End
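
      The difference between the first two levels can be sketched in a few lines of Python. Everything here (the `PriceService` and `FixedRates` classes and their methods) is a made-up example. The unit test isolates one class by mocking its dependency, while the integration test lets two real units work together:

```python
from unittest.mock import Mock


class FixedRates:
    """Hypothetical dependency: looks up an exchange rate from a table."""

    def __init__(self, table):
        self.table = table

    def rate_for(self, currency):
        return self.table[currency]


class PriceService:
    """Hypothetical unit under test: converts a USD amount using a rates provider."""

    def __init__(self, rates):
        self.rates = rates

    def price_in(self, amount_usd, currency):
        return round(amount_usd * self.rates.rate_for(currency), 2)


def unit_test():
    # The dependency is replaced by a mock, so only PriceService logic runs.
    rates = Mock()
    rates.rate_for.return_value = 2.0
    service = PriceService(rates)
    assert service.price_in(10, "EUR") == 20.0
    rates.rate_for.assert_called_once_with("EUR")


def integration_test():
    # Two real units are wired together and checked as a whole.
    service = PriceService(FixedRates({"EUR": 0.5}))
    assert service.price_in(10, "EUR") == 5.0


unit_test()
integration_test()
print("both tests passed")
```

      An end-to-end test would go one step further and drive the whole running application the way a user would, which does not fit in a short snippet.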
    1. I also recently took about 10 months off of work, specifically to focus on learning. It was incredible, and I don’t regret it financially. I would often get up at 6 in the morning or even earlier (which I never do) just from excitement about what I was going to learn about and accomplish in the day. Spending my time focused Only on what I was most interested in was incredibly rewarding.

      Approach of taking 10 months off from work just to learn something new

    1. The major selling point of Julia these days is in crafting differentiable algorithms (data-driven code that neural networks use in machine learning) in Flux (machine learning library for Julia)

      Main selling point of Julia these days

    2. The reason that Julia is fast (ten to 30 times faster than Python) is because it is compiled and not interpreted

      Measured against Python, Julia seems to be even faster than Scala

    3. Scala is ten times faster than Python

      Interesting estimation

    1. If you are interested in exploring the dataset used in this article, it can be used straight from S3 with Vaex. See the full Jupyter notebook to find out how to do this.

      Example of EDA in Vaex ---> Jupyter Notebook

    2. Why is it so fast? When you open a memory mapped file with Vaex, there is actually no data reading going on. Vaex only reads the file metadata

      Vaex only reads the file metadata:

      • location of the data on disk
      • data structure (number of rows, columns...)
      • file description
      • and so on...
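
      The underlying idea can be illustrated with Python's standard `mmap` module. This is only a sketch of memory mapping in general, not of how Vaex is implemented, and the file is a throwaway created for the demo:

```python
import mmap
import os
import tempfile

# Build a throwaway 1 MiB file to stand in for a large dataset.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * (1024 * 1024 - 5) + b"hello")

with open(path, "rb") as f:
    # Mapping the file does not load its contents up front; the operating
    # system pages data in on demand when a region is actually accessed.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        size = len(mapped)             # known without scanning the data
        tail = mapped[size - 5:size]   # touches only the last few bytes

print(size, tail)  # 1048576 b'hello'
```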
    3. displaying a Vaex DataFrame or column requires only the first and last 5 rows to be read from disk

      Vaex tries to go over the entire dataset with as few passes as possible

    4. virtual columns. These columns just house the mathematical expressions, and are evaluated only when required

      Virtual columns
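
      A toy model of the idea in plain Python. The `VirtualColumn` class and the column names are invented for illustration and are not Vaex's API. The column stores only an expression and does no work until its values are requested:

```python
class VirtualColumn:
    """Toy virtual column: holds an expression, evaluates only on demand."""

    def __init__(self, expression, *sources):
        self.expression = expression
        self.sources = sources
        self.evaluations = 0  # counts how often the expression was computed

    def values(self):
        self.evaluations += 1
        return [self.expression(*row) for row in zip(*self.sources)]


distance_km = [1.0, 2.5, 4.0]
duration_h = [0.5, 0.5, 1.0]

# Defining the column costs nothing: no values are computed or stored yet.
speed = VirtualColumn(lambda d, t: d / t, distance_km, duration_h)
evaluations_before = speed.evaluations
print(evaluations_before)  # 0

# The expression runs only when the values are actually needed.
computed = speed.values()
print(computed)  # [2.0, 5.0, 4.0]
```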

    5. When filtering a Vaex DataFrame no copies of the data are made. Instead only a reference to the original object is created, on which a binary mask is applied

      Filtering Vaex DataFrame works on reference to the original data, saving lots of RAM

    6. Vaex supports Just-In-Time compilation via Numba (using LLVM) or Pythran (acceleration via C++), giving better performance. If you happen to have a NVIDIA graphics card, you can use CUDA via the jit_cuda method to get even faster performance.

      Tools supported by Vaex

    7. The describe method nicely illustrates the power and efficiency of Vaex: all of these statistics were computed in under 3 minutes on my MacBook Pro (15", 2018, 2.6GHz Intel Core i7, 32GB RAM). Other libraries or methods would require either distributed computing or a cloud instance with over 100GB to perform the same computations.

      Possibilities of Vaex

    8. Vaex is an open-source DataFrame library which enables the visualisation, exploration, analysis and even machine learning on tabular datasets that are as large as your hard-drive. To do this, Vaex employs concepts such as memory mapping, efficient out-of-core algorithms and lazy evaluations.

      Vaex - a library for working with datasets as large as your hard drive, thanks to:

      • memory mapping
      • efficient out-of-core algorithms
      • lazy evaluations.

      All wrapped in a Pandas-like API

    9. The first step is to convert the data into a memory mappable file format, such as Apache Arrow, Apache Parquet, or HDF5

      Before opening data with Vaex, we need to convert it into a memory mappable file format (e.g. Apache Arrow, Apache Parquet or HDF5). This way, 100 GB of data can be opened in Vaex in 0.052 seconds!

      Example of converting CSV ---> HDF5.

    10. AWS offers instances with Terabytes of RAM. In this case you still have to manage cloud data buckets, wait for data transfer from bucket to instance every time the instance starts, handle compliance issues that come with putting data on the cloud, and deal with all the inconvenience that come with working on a remote machine. Not to mention the costs, which although start low, tend to pile up as time goes on.

      AWS is one way to analyse data too big for RAM (e.g. in the 30-50 GB range), but it's still inconvenient:

      • managing cloud data buckets
      • waiting for data transfer from bucket to instance every time the instance starts
      • handling compliance issues that come with putting data on the cloud
      • dealing with remote machines
      • costs
    1. The Cornell Note-taking System

      The Cornell Note-taking System resembles a combination of active learning and spaced repetition, just like Anki

    1. notebook contains an actual running Python interpreter instance that you’re fully in control of. So Jupyter can provide auto-completions, parameter lists, and context-sensitive documentation based on the actual state of your code

      Notebook makes it easier to handle dynamic Python features

    2. Mathematica didn’t really help me build anything useful, because I couldn’t distribute my code or applications to colleagues (unless they spent thousands of dollars for a Mathematica license to use it), and I couldn’t easily create web applications for people to access from the browser. In addition, I found my Mathematica code would often end up much slower and more memory hungry than code I wrote in other languages.

      Disadvantages of Mathematica:

      • memory hungry, slow code
      • expensive license
      • code that can't be distributed to colleagues without a license
    3. a methodology that combines a programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained, and arguably more fun to write than programs that are written only in a high-level language. The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer.

      Literate programming described by Donald Knuth

    4. In recent years we’ve also begun to see increasing interest in exploratory testing as an important part of the agile toolbox

      Waterfall software development ---> agile ---> exploratory testing

    5. In the 1990s, however, things started to change. Agile development became popular. People started to understand the reality that most software development is an iterative process
    6. When I began coding, around 30 years ago, waterfall software development was used nearly exclusively.
    7. Development Pros Cons

      Table comparing pros and cons of:

      • IDE/Editor
      • REPL/shell
      • Traditional notebooks (like Jupyter)
    8. The point of nbdev is to bring the key benefits of IDE/editor development into the notebook system, so you can work in notebooks without compromise for the entire lifecycle
    9. Exploratory programming is based on the observation that most of us spend most of our time as coders exploring and experimenting

      In exploratory programming, we:

      • experiment with a new API to understand how it works
      • explore the behavior of an algorithm that we're developing
      • debug our code by trying combinations of inputs
    10. This kind of “exploring” is easiest when you develop on the prompt (or REPL), or using a notebook-oriented development system like Jupyter Notebooks

      It's easier to explore the code:

      • when you develop on the prompt (or REPL)
      • in a notebook-oriented system like Jupyter

      but it's not efficient to develop in them