113 Matching Annotations
  1. Jun 2021
    1. Li, X., Ostropolets, A., Makadia, R., Shoaibi, A., Rao, G., Sena, A. G., Martinez-Hernandez, E., Delmestri, A., Verhamme, K., Rijnbeek, P. R., Duarte-Salles, T., Suchard, M. A., Ryan, P. B., Hripcsak, G., & Prieto-Alhambra, D. (2021). Characterising the background incidence rates of adverse events of special interest for covid-19 vaccines in eight countries: Multinational network cohort study. BMJ, 373, n1435. https://doi.org/10.1136/bmj.n1435

    1. For example, Database Cleaner was for a long time a must-have add-on: we couldn’t use transactions to automatically roll back the database state, because each thread used its own connection; we had to use TRUNCATE ... or DELETE FROM ... for each table instead, which is much slower. We solved this problem by using a shared connection in all threads (via the TestProf extension). Rails 5.1 was released with similar functionality out-of-the-box.
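
      A minimal sketch of the shared-connection idea (the classic Capybara-era workaround; TestProf’s extension works along these lines, though its actual API differs):

      # Make every thread reuse a single ActiveRecord connection, so the
      # transaction opened by the test framework also covers threads spawned
      # by the app server under test.
      class ActiveRecord::Base
        def self.connection
          @shared_connection ||= retrieve_connection
        end
      end

      With this in place the suite can stay on transactional tests instead of DatabaseCleaner’s TRUNCATE/DELETE strategies.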
  2. May 2021
    1. Robert Colvile. (2021, February 16). The vaccine passports debate is a perfect illustration of my new working theory: That the most important part of modern government, and its most important limitation, is database management. Please stick with me on this—It’s much more interesting than it sounds. (1/?) [Tweet]. @rcolvile. https://twitter.com/rcolvile/status/1361673425140543490

  3. Mar 2021
  4. Feb 2021
  5. Jan 2021
    1. IANA Time Zone Database: the main time zone database. This is where Moment Timezone sources its data from.

      Every place has a history of different time zones for geographical, economic, political, and religious reasons. These rules are recorded in the IANA Time Zone database, which also contains the rules for Daylight Saving Time (DST). Check out the map on this page: https://en.wikipedia.org/wiki/Daylight_saving_time
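
      As a concrete illustration, the same IANA rules can be queried from Ruby via the TZInfo gem (which wraps the same tz database that Moment Timezone ships):

      require 'tzinfo'

      tz = TZInfo::Timezone.get('America/New_York')
      period = tz.period_for_utc(Time.utc(2021, 7, 1))
      period.dst?              # => true, the DST rule comes from the IANA data
      period.utc_total_offset  # => -14400 (UTC-4 while DST is in effect)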

  6. Dec 2020
    1. Databases: If database data is stored on a ZFS filesystem, it’s better to create a separate dataset with several tweaks:

      zfs create -o recordsize=8K -o primarycache=metadata -o logbias=throughput -o mountpoint=/path/to/db_data rpool/db_data

      • recordsize: match the typical RDBMS page size (8 KiB)
      • primarycache: disable ZFS data caching, as RDBMSs have their own
      • logbias: essentially disables log-based writes, relying on the RDBMSs’ integrity measures (see the detailed Oracle post)
  7. Nov 2020
    1. Interaction with stable storage in the modern world is generally mediated by systems that fall roughly into one of two categories: a filesystem or a database. Databases assume as much as they can about the structure of the data they store. The type of any given piece of data is known (e.g., an integer, an identifier, text, etc.), and the relationships between data are well defined. The database is the all-knowing and exclusive arbiter of access to data. Unfortunately, if the user of the data wants more direct control over the data, a database is ill-suited. At the same time, it is unwieldy to interact directly with stable storage, so something light-weight in between a database and raw storage is needed. Filesystems have traditionally played this role. They present a simple container abstraction for data (a file) that is opaque to the system, and they allow a simple organizational structure for those containers (a hierarchical directory structure).

      Databases and filesystems are both systems which mediate the interaction between user and stable storage.

      Often, the implicit aim of a database is to capture as much as it can about the structure of the data it stores. The database is the all-knowing and exclusive arbiter of access to data.

      If a user wants direct access to the data, a database isn't the right choice, but interacting directly with stable storage is too involved.

      A filesystem is a lightweight (container) abstraction between a database and raw storage. Files are opaque to the system (i.e. the system does not interpret their contents), and filesystems allow for a simple, hierarchical organizational structure of directories.

    1. I've spent the last 3.5 years building a platform for "information applications". The key observation which prompted this was that hierarchical file systems didn't work well for organising information within an organisation. However, hierarchy itself is still incredibly valuable. People think in terms of hierarchies - it's just that they think in terms of multiple hierarchies and an item will almost always belong in more than one place in those hierarchies. If you allow users to describe items in the way which makes sense to them, and then search and browse by any of the terms they've used, then you've eliminated almost all the frustrations of a file system. In my experience of working with people building complex information applications, you need:

      * deep hierarchy for classifying things
      * shallow hierarchy for noting relationships (eg "parent company")
      * multi-values for every single field
      * controlled values (in our case by linking to other items wherever possible)

      Unfortunately, none of this stuff is done well by existing database systems. Which was annoying, because I had to write an object store.

      Impressed by this comment. It foreshadows what Roam would become:

      • People think in terms of items belonging to multiple hierarchies
      • If you allow users to describe items in a way that makes sense to them and allow them to search and browse by any of the terms they've used, you've solved many of the problems of existing file systems

      What you need to build a complex information system is:

      • Deep hierarchies for classifying things (overlapping hierarchies should be possible)
      • Shallow hierarchies for noting relationships (Roam does this with a flat structure)
      • Multi-values for every single field
      • Controlled values (e.g. linking to other items when possible)
  8. Oct 2020
    1. Imagine a situation wherein you have just launched your app, but the app’s data is not being displayed properly, or you are not able to fetch the data entered by users. What impression of your app will be left in the user’s mind?

      Many businesses get confused when it comes to choosing the right database for their application. In fact, choosing between SQLite and Realm is a crucial decision.

  9. Sep 2020
    1. Realm is a new database module that improves the way databases are used and also supports relationships between objects. If you are part of the SQL development world, then you must be familiar with Realm.
    1. So Memex was first and foremost an extension of human memory and the associative movements that the mind makes through information: a mechanical analogue to an already mechanical model of memory. Bush transferred this idea into information management; Memex was distinct from traditional forms of indexing not so much in its mechanism or content, but in the way it organised information based on association. The design did not spring from the ether, however; the first Memex design incorporates the technical architecture of the Rapid Selector and the methodology of the Analyzer — the machines Bush was assembling at the time.

      How much further would Bush have gone if he had known about graph theory? He is describing a graph database with nodes and edges; the graph model itself is the key to the Memex.

  10. Aug 2020
    1. Allows batch updates by silencing notifications while the fn is running. Example:

      form.batch(() => {
        form.change('firstName', 'Erik') // listeners not notified
        form.change('lastName', 'Rasmussen') // listeners not notified
      }) // NOW all listeners notified
  11. Jul 2020
  12. Jun 2020
    1. Normalize the database for this case if your data is going to be modified multiple times
    2. normalizing our database will help us. What does it mean to normalize? Well, it simply means to separate our information as much as we can

      directly contradicts firebase's official advice: denormalize the structure by duplicating some of the data: https://youtu.be/lW7DWV2jST0?t=378

    1. Denormalization is a database optimization technique in which we add redundant data to one or more tables
    1. Deadlocks are a classic problem in transactional databases, but they are not dangerous unless they are so frequent that you cannot run certain transactions at all. Normally, you must write your applications so that they are always prepared to re-issue a transaction if it gets rolled back because of a deadlock.
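
      A hedged sketch of that "re-issue the transaction" advice in ActiveRecord terms (the model and attribute names are made up; Rails raises ActiveRecord::Deadlocked when the database picks the transaction as the deadlock victim):

      def transfer_funds(from, to, amount, attempts: 3)
        ActiveRecord::Base.transaction do
          from.lock!   # SELECT ... FOR UPDATE
          to.lock!
          from.update!(balance: from.balance - amount)
          to.update!(balance: to.balance + amount)
        end
      rescue ActiveRecord::Deadlocked
        attempts -= 1
        retry if attempts > 0
        raise
      end
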
    1. transaction calls can be nested. By default, this makes all database statements in the nested transaction block become part of the parent transaction. For example, the following behavior may be surprising:

      User.transaction do
        User.create(username: 'Kotori')
        User.transaction do
          User.create(username: 'Nemu')
          raise ActiveRecord::Rollback
        end
      end

      creates both “Kotori” and “Nemu”. The reason is that the ActiveRecord::Rollback exception in the nested block does not issue a ROLLBACK. Since these exceptions are captured in transaction blocks, the parent block does not see it and the real transaction is committed.

      How is this okay??

      When would it ever be the desired/intended behavior for a raise ActiveRecord::Rollback to have absolutely no effect? What good is the transaction then??

      What happened to the principle of least surprise?

      Is there any reason we shouldn't just always use requires_new: true?

      If, like they say, the inner transaction "become[s] part of the parent transaction", then if anything, it should roll back the parent transaction too — not roll back nothing.
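
      For comparison, a minimal sketch of the requires_new: true variant asked about above; Rails documents that it opens a savepoint, so the inner rollback actually undoes the inner work:

      User.transaction do
        User.create(username: 'Kotori')
        User.transaction(requires_new: true) do
          User.create(username: 'Nemu')
          raise ActiveRecord::Rollback   # rolls back to the savepoint
        end
      end
      # => only "Kotori" is persisted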

    2. One workaround is to begin a transaction on each class whose models you alter:
  13. May 2020
    1. I think you should normalize if you feel that introducing update or insert anomalies can severely impact the accuracy or performance of your database application. If not, then determine whether you can rely on the user to recognize and update the fields together. There are times when you’ll intentionally denormalize data. If you need to present summarized or compiled data to a user, and that data is very time consuming or resource intensive to create, it may make sense to maintain this data separately.

      When to normalize and when to denormalize. The key is to think about UX, in this case the factors are db integrity (don't create errors that annoy users) and speed (don't make users wait for what they want)

    2. Can database normalization be taken too far? You bet! There are times when it isn’t worth the time and effort to fully normalize a database. In our example you could argue for keeping the database in second normal form, since the CustomerCity to CustomerPostalCode dependency isn’t a deal breaker.

      Normalization has diminishing returns

    3. Now each column in the customer table is dependent on the primary key.  Also, the columns don’t rely on one another for values.  Their only dependency is on the primary key.

      Columns depending only on the primary key and not on one another is how you get to 2NF and 3NF

    4. A table is in third normal form if: it is in second normal form, and it contains only columns that are non-transitively dependent on the primary key

      3NF Definition
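
      A sketch of what removing that transitive dependency could look like as a (hypothetical) ActiveRecord migration, reusing the article's customer/postal-code example:

      class ExtractPostalCodes < ActiveRecord::Migration[6.1]
        def change
          # City is determined by postal code, not by the customer itself,
          # so the pair moves into its own table to reach 3NF.
          create_table :postal_codes do |t|
            t.string :code, null: false, index: { unique: true }
            t.string :city, null: false
          end
          add_reference :customers, :postal_code, foreign_key: true
          remove_column :customers, :customer_city, :string
          remove_column :customers, :customer_postal_code, :string
        end
      end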

  14. Apr 2020
    1. From a narratological perspective, it would probably be fair to say that most databases are tragic. In their design, the configuration of their user interfaces, the selection of their contents, and the indexes that manage their workings, most databases are limited when set against the full scope of the field of information they seek to map and the knowledge of the people who created them. In creating a database, we fight against the constraints of the universe – the categories we use to sort out the world; the limitations of time and money and technology – and succumb to them.

      databases are tragic!

    1. columnar databases are well-suited for OLAP-like workloads (e.g., data warehouses) which typically involve highly complex queries over all data (possibly petabytes). However, some work must be done to write data into a columnar database. Transactions (INSERTs) must be separated into columns and compressed as they are stored, making it less suited for OLTP workloads. Row-oriented databases are well-suited for OLTP-like workloads which are more heavily loaded with interactive transactions. For example, retrieving all data from a single row is more efficient when that data is located in a single location (minimizing disk seeks), as in row-oriented architectures. However, column-oriented systems have been developed as hybrids capable of both OLTP and OLAP operations, with some of the OLTP constraints column-oriented systems face mediated using (amongst other qualities) in-memory data storage.[6] Column-oriented systems suitable for both OLAP and OLTP roles effectively reduce the total data footprint by removing the need for separate systems

      Typical applications (adding new user data, or even retrieving a user's data) are better served by a (standard) row-oriented DB. Typical analytics operations, such as even a simple AVG over a whole column, are much slower there, because the elements of the same column are stored far away from each other in a traditional row-oriented DB, which increases disk-access time.

    2. seek time is incredibly long compared to the other bottlenecks in computers
    3. Operations that retrieve all the data for a given object (the entire row) are slower. A row-based system can retrieve the row in a single disk read, whereas numerous disk operations to collect data from multiple columns are required from a columnar database.
    1. Relational databases are designed around joins, and optimized to do them well. Unless you have a good reason not to use a normalized design, use a normalised design. jsonb and things like hstore are good for when you can't use a normalized data model, such as when the data model changes rapidly and is user defined. If you can model it relationally, model it relationally. If you can't, consider json etc.
    2. Joins are not expensive. Who told you that? As basically the whole concept of relational databases revolves around joins (from a practical point of view), these products are very good at joining. The normal way of thinking is to start with properly normalized structures and go into fancy denormalizations and similar stuff when the performance really needs it on the reading side. JSON(B) and hstore (and EAV) are good for data with unknown structure.
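
      A small sketch of that advice in ActiveRecord/PostgreSQL terms (table and column names are made up): keep what you can model relationally normalized and joinable, and reserve jsonb for the genuinely user-defined remainder.

      class CreateProducts < ActiveRecord::Migration[6.1]
        def change
          create_table :products do |t|
            t.references :category, null: false, foreign_key: true  # normalized, join-friendly
            t.string :name, null: false
            t.jsonb :custom_fields, null: false, default: {}        # unknown, user-defined structure
            t.timestamps
          end
          add_index :products, :custom_fields, using: :gin          # still queryable
        end
      end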
  15. Mar 2020
    1. I chose all my scholarly journals, I put them together. I chose some YouTube videos; they were –IF:        Mm-hmm.CF:        – like, a bunch of TED talks.        

      Compiling research materials.

      Is there room for us to think about the iterative process? Can we work with instructors to "reward" (or assign) students to alternate the searching, reading and writing?

    2. And – And I seen how – I saw how many, um, scholarly journals or how many sources came up for it, right? Um, number of sources. Right. And then, if I – if I felt like it wasn’t enough for me to thoroughly talk about the topic, I would move on. Right? So, when I did segregation, there – like, I guess, like, my specific topic was modern-day, so there wasn’t really much about it. Right? So, not much info. Right? And then, when I did gentrification, there were a lot, right?

      This part of the process is interesting to me. Links topic selection to search (seemingly a single search).

      It also seems a little misguided. What can we do in our lessons that could make tiny changes to this attitude?

  16. Oct 2019
  17. Sep 2019
    1. The problem with the annotation notion is that it's the first time that we consider a piece of data which is not merely a projection of data already present in the message store: it is out-of-band data that needs to be stored somewhere.

      could be same, schemaless datastore?

    2. many of the searches we want to do could be accomplished with a database that was nothing but a glorified set of hash tables

      Hello sql and clojure.set ns! ;P

    3. There are objects, sets of objects, and presentation tools. There is a presentation tool for each kind of object; and one for each kind of object set.

      very clojure-y mood, makes me think of clojure REBL (browser) which in turn is inspired by the smalltalk browser and was taken out of datomic (which is inspired by RDF, mentioned above!)

  18. May 2019
  19. Oct 2018
  20. Sep 2018
  21. Apr 2018
    1. The takeaway from the article: Choose a document-oriented database only when the data can be treated as a self-contained document

  22. Dec 2017
  23. alleledb.gersteinlab.org alleledb.gersteinlab.org
    1. AlleleDB is a repository, providing genomic annotation of cis-regulatory single nucleotide variants (SNVs) associated with allele-specific binding (ASB) and expression (ASE).
  24. Nov 2017
    1. select top 1 * from newsletters where IsActive = 1 order by PublishDate desc

      This doesn't require a full table scan or a join operation. That's just COOL
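
      Roughly the same query in ActiveRecord terms (model and column names assumed), for comparison:

      Newsletter.where(is_active: true).order(publish_date: :desc).first
      # => roughly SELECT ... WHERE is_active = 1 ORDER BY publish_date DESC LIMIT 1 (TOP 1 on SQL Server)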

    1. They have a very simplistic view of the activity being monitored, distilling it down into only a few dimensions for the rule to interrogate

      The number of dimensions needs to be large. In normal database systems these dimensions are few.

  25. Aug 2017
    1. Football Leaks, which consists of 1.9 terabytes of information and some 18.6 million documents, ranging from player contracts to emails revealing secret transfer fees and wages, is the largest leak in the history of sport.

      "Football Leaks, which consists of 1.9 terabytes of information and some 18.6 million documents, ranging from player contracts to emails revealing secret transfer fees and wages, is the largest leak in the history of sport."

      A pity this information is not available to the public.

      Given the limited release of documents, is it really the largest leak in the history of sport?

      The ICIJ offshore database may not be complete, but there is at least something, and it is searchable.

      Hopefully EIC will also follow this example.

  26. Jun 2017
    1. The vertices and edges of a graph, known as Atoms, are used to represent not only "data", but also "procedures"; thus, many graphs are executable programs as well as data structures.

      Rohan indicated that procedures are also part of the graph. Let us find out why.

  27. May 2017
  28. Mar 2017
    1. Genome Sequence Archive (GSA)

      Database URL is here: http://gsa.big.ac.cn/

      Note: the metadata is in INSDC format, but this database isn't part of the INSDC, so you'll still need to submit your data to one of those databases to meet internationally recognised mandates.