CSW
What is this?
does anyone use AMQP
Are you asking me?
This the specification of the Source object
I don't get this sentence.
"This is the specification of the Source object:" you mean?
Apache Airflow
Link to it.
DataFlows
Link to it.
WUI
What does this mean?
creaate
typo
processes
Operating system processes or another type of process?
on bare metal
What's "bare metal" in your definition?
communication between pods
And what's a "pod?"
Should you place descriptions of code inside code comments or in text (paragraphs or lists) outside of the sample code? Note that readers who copy-and-paste a snippet gather not only the code but also any embedded comments. So, put any descriptions that belong in the pasted code into the code comments. By contrast, when you must explain a lengthy or tricky concept, you should typically place the text before the sample program.
.
When your readers are very experienced with a technology, don't explain what the code is doing, explain why the code is doing it.
.
According to research by Sung and Mayer (2012), providing any graphics—good or bad—makes readers like the document more; however, only instructive graphics help readers learn.
.
Most readers appreciate at least a brief introduction under each heading to provide some context. Avoid placing a level three heading immediately after a level two heading, as in the following example
.
Conversely, don't make paragraphs too short. If your document contains plenty of one-sentence paragraphs, your organization is faulty.
.
Long paragraphs are visually intimidating. Very long paragraphs form a dreaded "wall of text" that readers ignore. Readers generally welcome paragraphs containing three to five sentences, but will avoid paragraphs containing more than about seven sentences.
.
Avoid putting too much text into a table cell. If a table cell holds more than two sentences, ask yourself whether that information belongs in some other format.
.
If the list item is a sentence, use sentence capitalization and punctuation. Otherwise, do not use sentence capitalization and punctuation.
.
Consider starting all items in a numbered list with an imperative verb.
.
Sentences that start with There is or There are marry a generic noun to a generic verb. Generic weddings bore readers.
.
Many technical writers believe that the verb is the most important part of a sentence. Pick the right verb and the rest of the sentence will take care of itself.
.
Most readers mentally convert passive voice to active voice.
.
passive verb = form of be + past participle verb
.
In an active voice sentence, an actor acts on a target. That is, an active voice sentence follows this formula: Active Voice Sentence = actor + verb + target. A passive voice sentence reverses the formula. That is, a passive voice sentence typically follows this formula: Passive Voice Sentence = target + verb + actor.
.
Use either of the following tactics to disambiguate this and that: Replace this or that with the appropriate noun. Place a noun immediately after this or that.
.
As a rule of thumb, if more than five words separate your noun from your pronoun, consider repeating the noun instead of using the pronoun.
.
See http://mathesaurus.sf.net/ for another MATLAB®/NumPy cross-reference.
Why link to another cross-reference of the same kind? If the goal is to provide a reference, I would change the title of this section and make that explicit.
from numpy import *
Generally, this is not recommended. Why recommend it here? Just because the code would look more like what MATLAB users are used to? If so, I believe the best approach is to teach the recommended way rather than let readers get results without following Python conventions.
‘array’ or ‘matrix’? Which should I use?
If the answer is so simple and obvious, I wouldn't make a question out of it. Just introduce arrays, then. And then explain how they work when compared to Matlab's version.
As a footnote, you may say "What about numpy matrices?" and give a short context.
In MATLAB®, arrays have pass-by-value semantics, with a lazy copy-on-write scheme to prevent actually creating copies until they are actually needed. Slice operations copy parts of the array. In NumPy arrays have pass-by-reference semantics. Slice operations are views into an array.
The title of the table is "differences," but the first sentence here is about a similarity.
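A minimal sketch of the slicing behaviour the quoted passage describes on the NumPy side, assuming NumPy is imported as np (the array names are illustrative only):

```python
import numpy as np

a = np.arange(5)           # [0, 1, 2, 3, 4]
view = a[1:4]              # basic slicing returns a view, not a copy
view[0] = 99               # writes through to the parent array

copied = a[1:4].copy()     # an explicit copy owns its own data
copied[1] = -1             # does not touch the parent array

print(a.tolist())          # [0, 99, 2, 3, 4]
```

An example like this next to the table row would show the difference without relying on the term "pass-by-reference."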
MATLAB® uses 1 (one) based indexing. The initial element of a sequence is found using a(1). See note INDEXING Python uses 0 (zero) based indexing. The initial element of a sequence is found using a[0].
The first sentence in each column is unnecessary.
In MATLAB®, the basic data type is a multidimensional array of double precision floating point numbers. Most expressions take such arrays and return such arrays. Operations on the 2-D instances of these arrays are designed to act more or less like matrix operations in linear algebra. In NumPy the basic type is a multidimensional array. Operations on these arrays in all dimensionalities including 2D are element-wise operations. One needs to use specific functions for linear algebra (though for matrix multiplication, one can use the @ operator in python 3.5 and above).
The sentences are too long. The same information can be conveyed in a lighter way.
Some Key Differences
This table could have column names.
MATLAB® and NumPy/SciPy have a lot in common. But there are many differences. NumPy and SciPy were created to do numerical and scientific computing in the most natural way with Python, not to be MATLAB® clones.
Weird way of putting it. Are they similar or not? I'm especially confused about the third sentence; I had to read it three times to understand it.
NumPy supports a much greater variety of numerical types than Python does.
I expected to read more about how these types differ from the comparable Python types. Put differently: what's the difference between type() and dtype?
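A short sketch of the kind of comparison that could answer this, assuming NumPy is imported as np. The dtypes are set explicitly here because the default integer dtype depends on the platform:

```python
import numpy as np

a = np.array([1, 2, 3], dtype=np.int32)
b = np.array([1.0, 2.0])

print(type(a))      # <class 'numpy.ndarray'>: the Python type of the container
print(a.dtype)      # int32: the type of every element stored in the array
print(b.dtype)      # float64: NumPy's default for Python floats
print(type(a[0]))   # <class 'numpy.int32'>: a NumPy scalar, not a Python int
```

In short, type() describes the array object itself, while dtype describes what it holds.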
Who Else Uses NumPy?
I expected to see logos here. Maybe links to the repositories on GitHub that import the library.
Beware: matplotlib also has a function to build histograms (called hist, as in Matlab) that differs from the one in NumPy.
How exactly? When should I use each?
See linalg.py in numpy folder for more.
I don't get this. Where is this file? If it's relevant, why not link to it?
Linear Algebra: Work in progress. Basic linear algebra to be included here.
This should probably be in an issue tracker.
Anyway, this could easily be its own documentation page.
rg
Up to this point, rg has not been imported.
The matrix product can be performed using the @ operator (in python >=3.5) or the dot function or method
What's the recommended way? In my experience, people use np.dot more often, partly because it makes bugs easier to spot when someone converts a matrix multiplication from a math formula and just uses *.
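For illustration, a minimal sketch of the three spellings side by side, assuming NumPy is imported as np and using two small made-up matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

matmul = A @ B           # matrix product
dotted = np.dot(A, B)    # same result as @ for 2-D arrays
elementwise = A * B      # NOT a matrix product: multiplies entry by entry

print(matmul.tolist())       # [[2, 1], [4, 3]]
print(elementwise.tolist())  # [[0, 2], [3, 0]]
```

Showing the * result next to the other two would make the bug described above visible at a glance.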
elementwise
What's "elementwise"? How does it differ from other approaches? This is a great place to show the difference from Python lists: try to do the same with lists.
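A sketch of what that list comparison could look like, assuming NumPy is imported as np (the values are borrowed from the tutorial's own example array):

```python
import numpy as np

values = [20, 30, 40, 50]
arr = np.array(values)

print(values * 2)             # list repetition: [20, 30, 40, 50, 20, 30, 40, 50]
print((arr * 2).tolist())     # elementwise: [40, 60, 80, 100]
print(values + values)        # list concatenation, same as values * 2
print((arr + arr).tolist())   # elementwise: [40, 60, 80, 100]
```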
[20,30,40,50]
Improve spacing
Why numpy. when the others don't have it? Why not link to the docs?
it’s very simple
Many would say that is not simple. Generally, I'd be in favor of not suggesting that something is simple or easy at all.
The best and easiest way to do this is to use Pandas.
It would be interesting to add that Pandas is not part of NumPy. Why should I use another library?
You can save a NumPy array as a plain text file like a .csv or .txt file with np.savetxt.
This is a good place to organize the hierarchy of the section. How many recommended ways do I have for saving an array to disk? What are the practical differences? When should I use each?
::
Duplicated.
You can save it as “filename.npy” with: >>> np.save('filename', a) You can use np.load() to reconstruct your array. >>> b = np.load('filename.npy')
It's odd that .npy appears only when loading. If the function supports saving with the extension in the name, I would add it to the example.
handle NumPy binary files with a .npy file extension, and a savez function that handles NumPy files with a .npz file extension.
What's the practical difference between the two extensions? What's the preferred/recommended way? Those are my first questions in this section.
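A hedged sketch of the practical difference, assuming NumPy is imported as np. The file names and the temporary directory are illustrative only, and the extension is spelled out when saving as well as when loading:

```python
import os
import tempfile

import numpy as np

a = np.arange(6)
b = np.linspace(0.0, 1.0, 3)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "single.npy")
    npz_path = os.path.join(tmp, "several.npz")

    np.save(npy_path, a)            # .npy: exactly one array per file
    np.savez(npz_path, a=a, b=b)    # .npz: an archive of several named arrays

    loaded = np.load(npy_path)      # returns the array directly
    with np.load(npz_path) as archive:
        names = sorted(archive.files)   # the keyword names used in savez
        loaded_b = archive["b"]         # arrays are looked up by name
```

So .npy fits a single array, while .npz fits a group of arrays that belong together.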
You will, at some point, want to save your arrays to disk and load them back without having to re-run the code.
"re-run the code." What code? What does this do? Maybe suggest that this was an array that was generated after calling multiple functions, etc.
For example, this is the mean square error formula (a central formula used in supervised machine learning models that deal with regression): Implementing this formula is simple and straightforward in NumPy: What makes this work so well is that predictions and labels can contain one or a thousand values. They only need to be the same size.
Amazing use of images and colors!
When it comes to the data science ecosystem, Python and NumPy are built with the user in mind.
I'd remove this. What ecosystem isn't (or at least wouldn't claim to be)?
This section covers help(), ?, ??
? and ?? on their own read as if something is missing.
The primary difference between the two is that the new array created using ravel() is actually a reference to the parent array (i.e., a “view”). This means that any changes to the new array will affect the parent array as well.
I never knew this! Every time I needed it, I would google "the correct way" because it often would not behave as I expected.
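A minimal sketch of the difference, assuming NumPy is imported as np. Note that ravel() returns a view only when it can, as with the contiguous array here, and otherwise falls back to a copy:

```python
import numpy as np

parent = np.arange(6).reshape(2, 3)

raveled = parent.ravel()       # a view here, because parent is contiguous
flattened = parent.flatten()   # always a copy

raveled[0] = 99                # changes parent as well
flattened[1] = -1              # leaves parent untouched

print(parent.tolist())         # [[99, 1, 2], [3, 4, 5]]
```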
How to reverse an array
I had never heard of np.flip. I'd consider leaving this section out, or explaining why it's preferable to Python's reversed.
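One possible justification, sketched under the assumption that NumPy is imported as np: on a 2-D array, reversed() only walks the first axis, while np.flip can reverse any or all axes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

rows_reversed = np.array(list(reversed(a)))   # reversed() only walks the first axis
flipped_all = np.flip(a)                      # no axis given: reverses every axis
flipped_cols = np.flip(a, axis=1)             # reverses within each row

print(rows_reversed.tolist())   # [[3, 4], [1, 2]]
print(flipped_all.tolist())     # [[4, 3], [2, 1]]
print(flipped_cols.tolist())    # [[2, 1], [4, 3]]
```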
transpose your matrices
Considering what the rest of this page assumes about the reader, it would be nice to explain what transposing is.
generate random numbers (actually, repeatable pseudo-random numbers)
I believe that this information is obvious for someone who knows the difference between the two. For those who don't know, it might add confusion. I'd remove it or leave it for a different paragraph, with some optional context.
You
I'm slightly confused about the different colors and shades.
NumPy
Why two grades of purple?
Views are an important NumPy concept!
Let's bold this, then!
shallow copy
Why italic?
You can also stack two existing arrays, both vertically and horizontally.
Another excellent place for visualizations. Not only a static image, but possibly a GIF.
slicing and indexing, np.vstack(), np.hstack(), np.hsplit(), .view(), copy()
Why do only some of these use the np. prefix?
a%2==0
Add spacing
You can visualize it this way:
Amazing to see visualizations. I think they should be much more present.
This section covers ndarray.ndim, ndarray.size, ndarray.shape
This might be intentional, but I've started wondering whether these names should link to their own documentation.
, axis=0
I've seen this confuse many people, and I still have to think for a moment about which axis is 0 and which is 1. It would be nice to add a sentence on that before using it.
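A small sketch that could accompany such a sentence, assuming NumPy is imported as np. The example array is made up: axis 0 runs down the rows, axis 1 runs across the columns, and an operation collapses or grows the axis it is given.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])    # shape (2, 3): 2 rows, 3 columns

col_sums = a.sum(axis=0)     # collapses axis 0 (rows): one result per column
row_sums = a.sum(axis=1)     # collapses axis 1 (columns): one result per row

stacked = np.concatenate((a, a), axis=0)   # grows along axis 0

print(col_sums.tolist())     # [5, 7, 9]
print(row_sums.tolist())     # [6, 15]
print(stacked.shape)         # (4, 3)
```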
This section covers np.sort(), np.concatenate()
Amazing for preparing the reader for what's coming.
np.arange(2, 9, 2)
It would be nice to add the parameter names here, too.
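For instance, the same call with its arguments named, assuming NumPy is imported as np:

```python
import numpy as np

evens = np.arange(start=2, stop=9, step=2)   # stop is exclusive
print(evens.tolist())                        # [2, 4, 6, 8]
```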
Arrays and array operations are much more complicated than are captured here!
I think it's possible to say the same without scaring the person away.
2D
I believe it's ok to drop this.
You might occasionally hear an array referred to as a “ndarray,” which is shorthand for “N-dimensional array.” An N-dimensional array is simply an array with any number of dimensions. You might also hear 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. The NumPy ndarray class is used to represent both matrices and vectors. A vector is an array with a single dimension (there’s no difference between row and column vectors), while a matrix refers to an array with two dimensions. For 3-D or higher dimensional arrays, the term tensor is also commonly used.
Again, another place for images.
“0”
*`0`
One way we can initialize NumPy arrays is from Python lists, using nested lists for two- or higher-dimensional data.
I'd love to see a drawing here.
Why use NumPy?
It would be nice to add that lists can often be used interchangeably with NumPy arrays, but without the performance gains.
it’s very easy to understand
It's better to explain, not say how people are supposed to feel when learning this new thing.
We shorten
Who's "we"?
Installing NumPy
How do I choose between Anaconda and pip? The text seems to assume that I know the difference.
find more information
*learn more
Need Discovery
I see that this nomenclature (e.g. needs summary, job epics, blueprint) is widely used in documents across the organization. Between my first day about four months ago and this first day of the Tech Leaders Program, the difference between these documents has never been clear to me. I wonder if there are ways to make this more explicit for people who haven't yet deeply understood this dojo page.
NB
What does it mean?
The way I see it, this role sits not only on the technical side but also on the product side. In this context, does a tech leader work with a product owner? If so, how do they interact?
These are the Datopian Tech Leaders.
It would be interesting to have a directory of graduates on this page. This way, people can refer to them in the future.
Needs Analysis
Personally, I read it as "(it) needs analysis" for a few weeks while reading one such doc about an existing Datopian project. Silly mistake, I know.
need or want to requirements.
TODO: change to
"need" or "want" to "requirements."
Plan: planning work to deliver that solution. This includes breaking down a design into tasks, clarifying their dependencies and estimating these (i.e. a roadmap).
If the first steps are done by someone who's not going to be the developer, ideally, the estimation step should include the developers (e.g., engineers, designers).
We emphasize that analysis is applicable both to large projects and to a single simple task. Just as test-driven development is worthwhile even for simple changes, so analysis will pay dividends even for a small script or a minor change to a website.
Reminds me of README-Driven Development, which is related to Test-Driven Development.