1. Last 7 days
    1. a writing system (known as Indus Script or Harappan Script) it has not yet been deciphered

      Will deciphering it ever be possible? Are there ways of understanding a language other than translating it from another known language?

    2. Vedic

      What are "Vedic" sources?

    1. for - climate change impacts - marine life - citizen-science - potential project - climate departure - ocean heating impacts - marine life - marine migration - migrating species face collapse - migration to escape warming oceans - population collapse

      main research findings - Study involved 146 species of temperate or subpolar fish and 2,572 time series - Extremely fast-moving species (17 km/year) showed large declines in population, while fish that did not shift showed negligible decline - Those on the northernmost edge experienced the largest declines - There is speculation that the fastest-moving species are also the ones with the fewest evolutionary adaptations for new environments

    1. 34.2

      Error code description:

      Faulty data communication to the middle fan motor on floor units and the bottom fan motor on 10 tray table top units.

      Condition for error detection:

      BUS signal from middle fan motor on floor units and the bottom fan motor on 10 tray table top units is missing or is not transmitted for at least 5 seconds at a time.

      Error area:

      Data transfer cable, control electronics for middle fan motor on floor units and the bottom fan motor on 10 tray table top units

      Relevant causes/components:

      • eSTL has initialised
      • Electrical connection to components
      • Control electronics for middle fan motor
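      The detection condition above (bus signal missing for at least 5 seconds) can be pictured as a simple timeout check. This is only a minimal sketch: the 5-second window comes from the description, while the class, its fields, and the timestamps are hypothetical and do not represent the unit's firmware.

```python
from dataclasses import dataclass

# Assumption: 5 s is taken from the error description; everything else
# (names, timestamps) is illustrative only.
BUS_TIMEOUT_S = 5.0


@dataclass
class BusMemberWatchdog:
    name: str
    last_seen_s: float = 0.0  # time of the last valid bus frame, in seconds

    def frame_received(self, now_s: float) -> None:
        """Record that this bus member answered at time now_s."""
        self.last_seen_s = now_s

    def is_faulty(self, now_s: float) -> bool:
        """True once the member has been silent for at least the timeout."""
        return now_s - self.last_seen_s >= BUS_TIMEOUT_S


middle_fan = BusMemberWatchdog("middle fan motor")
middle_fan.frame_received(now_s=10.0)
print(middle_fan.is_faulty(now_s=12.0))  # silent for 2 s
print(middle_fan.is_faulty(now_s=15.0))  # silent for 5 s
```

      In this sketch the error would only be raised by the second check, once the silence reaches the full 5 seconds.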
    2. Fan motor 4

      Fan motor bottom on floor units

    3. Fan motor 2

      Fan motor bottom for table top units.

      Fan motor middle for floor units

    4. Fan motor 1

      Always the top fan motor

    5. In case bus components are failing to communicate via the bus there are 34.x error messages. If more than one bus member is missing, the addresses are no longer added up as in the former SCC series; instead, each bus member is displayed with its own error message.

      For SCC / SCC 5 Senses and SCCWE:

      If more than one bus member was faulty, the bus error codes were added together. For example, if both the top and bottom motors were faulty and not communicating with the Bus Master, Service 34.1 and Service 34.2 were added to give a Service 34.3 error message (1+2=3).

      For the iCombi series each bus member fault is shown individually, and in the above circumstance both Service 34.1 and Service 34.2 would be shown as error codes.
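      The difference between the two reporting schemes can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and decoding a summed code assumes that bus addresses are distinct powers of two (1, 2, 4, ...), which the note above confirms only for the 1+2=3 case.

```python
# Hypothetical helpers; names are illustrative, not taken from any manual.
# Assumption: bus member addresses are distinct powers of two, so that a
# summed legacy code can be split back into its parts.


def scc_combined_code(faulty_addresses):
    """Legacy SCC style: one message whose suffix is the sum of addresses."""
    return f"Service 34.{sum(faulty_addresses)}"


def split_scc_code(total):
    """Recover the individual addresses from a summed legacy suffix."""
    return [1 << bit for bit in range(total.bit_length()) if total & (1 << bit)]


def icombi_codes(faulty_addresses):
    """iCombi style: one message per faulty bus member."""
    return [f"Service 34.{address}" for address in sorted(faulty_addresses)]


faulty = [1, 2]  # top and bottom fan motors, as in the example above
print(scc_combined_code(faulty))
print(split_scc_code(3))
print(icombi_codes(faulty))
```

      Under that assumption, a technician reading a legacy summed code has to split it mentally, whereas the iCombi list needs no decoding.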

    6. Netzfrequenz nicht erkannt

      Mains frequency not detected

    7. Falschanschluß-Erkennung Low-Limit

      False connection detection - Low Limit

    8. Falschanschluß-Erkennung High-Limit

      False connection detection - High Limit

    9. Störung kapazitive Taste

      Capacitive button malfunction.

    10. X76: With gas units only: 230 V to be switched at K1 main contactor to supply A5 ignition box and gas blower motors.

      230 Vac output, present on gas models only, which supplies the gas ignition controller, but only once the K1 contactor energises after the ESTL checks.

    11. X75: A13 pump control pcb or A18 with iC Pro XS

      240Vac supply to A13 pump board iCombi Pro and Classic

      240Vac supply to A18 pump board for iCombi XS units.

      The iCombi Pro XS does not have the new A13 as it still has the original hardware from index I (i.e. Drain valve, SC pump, Care pump, CleanJet pump). Therefore it needs a special pump adapter pcb A18 to be able to control the older hardware with the new A10 I/O board.

    12. X70: M5 cooling fan

      X70 M5 cooling fan output 12Vdc.

      With the exception of iCombiPro XS and CMP XS, the cooling fan M5 is controlled by the CPU and depends on temperatures on several pcbs and the fan motor/s.

      With all electric and gas units 6-1/1 to 10-2/1 M5 is located at the bottom right hand corner of the electrical compartment directly above the air filter.

      With all electric and gas floor units M5 is located at the bottom right hand corner of the electrical compartment directly above the air filter.

      iCombi Pro XS. The cooling fan M5 is controlled by the temperature of B10 thermocouple in the electrical compartment; B10 is located behind the inverter of M1 fan motor and connected to A10, X6, Pins 3, 4.

      With iCombi Pro XS units M5 is located at the bottom right hand corner of the electrical compartment directly above the air filter.

    13. X59: LED light (6,8 V)

      X59 6.8V Power supply to Door LED arrays

    14. X51: 12 pol connector: 5 V stand-by voltage to A11, 12 V to A11 if unit is on and bus signal from A11

      X51 is a 12 pole female connector from which a 12 core cable carries data to and from all Bus members.

      The 12 wire cable also carries a 5V "stand by" voltage from I/O board A10 to CPU (A11 iCombi Pro) and the ICP capacitor switch pcb (A19).

      The "stand by" voltage enables operation of the capacitor on/off switch and the CPU as soon as power is supplied to the oven (as the isolator is switched on).

      When the capacitor switch is operated, ESTL completes its safety checks and providing no high temperature errors or component errors exist, contactor K1 energises and a 12 V supply is also sent to the CPU via this connector and its associated cable.

    15. X52: 6 pol connector: bus to A13 pump control pcb

      Data Highway for communications between CPU and all Bus members.

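    1. Of course, in real-world use the more common case is a String -> Date conversion. How do you handle it? There are two options: revisit the earlier articles in this series, since they covered it more than once; or follow the later articles, because this case is so common that it will be covered in depth again (especially for use under Spring MVC).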
  2. ivanov-petrov.livejournal.com
    1. "When life goes against our wishes, the world around us is like healing needles and curative medicines: it quietly heals us. When we meet no resistance, the world around us is like sharpened axes and keen pikes: it stealthily wounds and kills."
    1. colloquium

      an academic conference or seminar

    2. hand-wringing

      the excessive display of concern or distress

    3. mayhem

      violent or extreme disorder; chaos: complete mayhem broke out | thousands of supporters tried to force their way into the stadium, causing mayhem.

    4. hellbent

      determined to achieve something at all costs: she's hell-bent on leaving.

    1. 1:01:00 Rian Doris was into quantified self in his late teens

      Also see other episode where Connor Murphy shared that same interest

    1. Webpage Snapshots Zotero already saves webpage snapshots on news articles and other pages, and those now open automatically in the new reader as well, enabling you to annotate webpages as easily as PDFs.

      Is it going to be possible to annotate with Zotero for web?

    1. Interesting interpretation: Jack says that when the song asks "Who made up words, who made up numbers? Who wrote the Bible, who wrote the Q'uran", it might not even be a call to reflect and think for yourself (although that is absolutely a recurring theme in the song); it may instead imply that all science traces back not to the West (Europe) but to the East (Egypt, Africa). This interpretation fits the album the song appears on, which is about Africa.

    2. Patience reaction video

    1. 54:50 "getting things done" is used in productivity vocabulary, not necessarily tied to the methodology "GTD". It simply means to produce and do stuff, and it seemingly rolls off the tongue well?

    1. The song's criticism of mass media is mainly related to sensationalism.

      "Good" things are usually not sensational. They do not demand attention, hence why the code of known/unknown based on selectors for attention filters it out.

      Reference Hans-Georg Moeller's explanations of Luhmann's mass media theory based on functionally differentiated systems theory.

      Can also compare to Simone Weil's thoughts on collectives and opinion: organizations (thus most of mass media) should not be allowed to form opinions, as this is an act of the intellect, which resides only in the individual. According to her, opinion of any form meant to spread lies or parts of the truth rather than the whole truth should be disallowed, because truth is a foundational, even the most sacred, need of the soul.

      People must be protected against misinformation.

    2. Patience reaction vid

    1. On a general level, the song is not just about criticizing society, but also about stimulating independence... and not just in thought and identity, but in everything.

      Don't be dependent on external factors.

    2. Unrelated to the song itself. It is interesting that different people interpret the song's meaning differently. Likely due to individual differences in perspective, history, culture, etc.

      Makes me reflect. Is knowledge/wisdom contained solely in content and words? Or is knowledge/wisdom rather contained in the RELATIONSHIP, the INTERACTION, between past experience, previous knowledge (identity) and substance?

      Currently I am inclined to go for the latter.

    3. The idea of growing wiser vs. growing tall is likely not meant for the individual but, given the full context of the song, for society as a whole or the world at large. But it might have a double meaning and refer to both individual and society.

      Reminds me of Taleb's concept of Epistemic Arrogance (overvaluing that which we know)

    4. Songwriters don't criticize keeping zoo animals. They criticize prioritizing the zoo animals over the youth/humans (take in the full context bro)... Prioritize money over humanity.

    5. Reaction vid to patience

    1. Sunwest Silver is the largest supplier and manufacturer of charms and findings in the USA. Our silver charms are created by our own artisans and are original designs. If you are doing your own production and manufacturing, Sunwest Silver provides the industry with a state of the art casting facility, ready to create your unique designs, made right here in the U.S.A. Please contact us with inquiries.

      Established in 1970, and located in the heart of Albuquerque, N.M., Sunwest Silver Co., Inc. is your comprehensive industry source, providing for your needs from the mines up.

    1. Editors Assessment:

      RAD-Seq (Restriction-site-associated DNA sequencing) is a cost-effective method for single nucleotide polymorphism (SNP) discovery and genotyping. In this study the authors performed a kinship analysis and pedigree reconstruction for two different cattle breeds (Angus and Xiangxi yellow cattle). A total of 975 cattle, including 923 offspring with 24 known sires and 28 known dams, were sampled and subjected to SNP discovery and genotyping using RAD-Seq. This produced a SNP panel with 7305 SNPs capturing the maximum difference between paternal and maternal genome information, able to distinguish between the F1 and F2 generation with 90% accuracy. Peer review helped to better highlight the practical applications of this work. The combination of the efficiency of RAD-Seq and advances in kinship analysis here can hopefully help improve breed management, local resource utilization, and conservation of livestock.

      This evaluation refers to version 1 of the preprint

    2. Abstract: Kinship and pedigree information, used for estimating inbreeding, heritability, selection, and gene flow, is useful for breeding and animal conservation. However, as the size of the crossbred population increases, inaccurate generation and parentage recording in livestock farms increases. Restriction-site-associated DNA sequencing (RAD-Seq) is a cost-effective platform for single nucleotide polymorphism (SNP) discovery and genotyping. Here, we performed a kinship analysis and pedigree reconstruction for Angus and Xiangxi yellow cattle, which benefit from good meat quality and yields, providing a basis for livestock management. A total of 975 cattle, including 923 offspring with 24 known sires and 28 known dams, were sampled and subjected to SNP discovery and genotyping. The identified SNPs panel included 7305 SNPs capturing the maximum difference between paternal and maternal genome information allowing us to distinguish between the F1 and F2 generation with 90% accuracy. In addition, parentage assignment software based on different strategies verified the cross-assignments. In conclusion, we provided a low-cost and efficient SNP panel for kinship analyses and the improvement of local genetic resources, which are valuable for breed improvement, local resource utilization, and conservation.

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.131), and has published the reviews under the same license. These are as follows.

      Reviewer 1. Liyun wan

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      The detailed parameters for the SNP and InDel calling should be described to allow reproduction.

      Additional Comments:

      This research provides valuable insights into the use of RAD-Seq for kinship analysis and pedigree reconstruction, which is useful for breeding and animal conservation purposes. Overall, the study is well-conducted and the findings are relevant. However, there are a few aspects that require attention before the manuscript can be considered for publication. Please address the following points: 1. Provide practical applications: Highlight the practical applications of your research in livestock management, breed improvement, local resource utilization, and conservation. Discuss how the low-cost and efficient SNP panel can contribute to these areas and provide suggestions for further research or implementation. 2. Language and clarity: Review the manuscript for clarity, grammar, and sentence structure. Ensure that all key terms and concepts are defined and explained to facilitate understanding for a broad readership. Once these revisions have been made, I believe the manuscript will be much stronger and suitable for publication.

      Reviewer 2. Mohammad Bagher Zandi

      Is the language of sufficient quality?

      Yes. It was great.

      Are all data available and do they match the descriptions in the paper?

      Yes. The raw sequencing reads were deposited, but it would be better to share the SNP data as well.

      Is the data acquisition clear, complete and methodologically sound?

      No. SNP detection and SNP selection for the assignment test are not clear.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      No. In some cases, the materials and methods section is vague. It is better to correct them. It is mentioned in the attached manuscript text.

      Additional Comments: Well done research, but the manuscript needs some corrections, as commented on in the attached file. See: https://gigabyte-review.rivervalleytechnologies.comdownload-api-file?ZmlsZV9wYXRoPXVwbG9hZHMvZ3gvRFIvNTA1L2dpZ2EtY29tZW50cy5kb2N4

    1. Blair, R J R. “Considering Anger from a Cognitive Neuroscience Perspective.” Wiley Interdisciplinary Reviews Cognitive Science 3, no. 1 (October 19, 2011): 65–74. https://doi.org/10.1002/wcs.154.

    1. Editors Assessment: This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong (see https://doi.org/10.46471/GIGABYTE_SERIES_0006). This example assembles the genome of the black-faced spoonbill (Platalea minor), an emblematic wading bird from East Asia that is classified as globally endangered by the IUCN. This Data Release reports a 1.24 Gb chromosomal-level genome assembly produced using a combination of PacBio SMRT and Omni-C scaffolding technologies. BUSCO and Merqury validation were carried out, gene models created, and peer reviewers also requested MCscan synteny analysis. This showed the genome assembly had high sequence continuity with scaffold length N50 = 53 Mb. Presenting data from 14 individuals, this will hopefully be a useful and valuable resource for future population genomic studies aimed at better understanding spoonbill species numbers and conservation.

      This evaluation refers to version 1 of the preprint

    2. Abstract: Platalea minor, the black-faced spoonbill (Threskiornithidae) is a wading bird that is confined to coastal areas in East Asia. Due to habitat destruction, it has been classified by The International Union for Conservation of Nature (IUCN) as globally endangered species. Nevertheless, the lack of its genomic resources hinders our understanding of their biology, diversity, as well as carrying out conservation measures based on genetic information or markers. Here, we report the first chromosomal-level genome assembly of P. minor using a combination of PacBio SMRT and Omni-C scaffolding technologies. The assembled genome (1.24 Gb) contains 95.33% of the sequences anchored to 31 pseudomolecules. The genome assembly also has high sequence continuity with scaffold length N50 = 53 Mb. A total of 18,780 protein-coding genes were predicted, and high BUSCO score completeness (93.7% of BUSCO metazoa_odb10 genes) was also revealed. A total of 6,155,417 bi-allelic SNPs were also revealed from 13 P. minor individuals, accounting for ∼5% of the genome. The resource generated in this study offers the new opportunity for studying the black-faced spoonbill, as well as carrying out conservation measures of this ecologically important spoonbill species.

      This work is part of a series of papers presenting outputs of the Hong Kong Biodiversity Genomics Consortium (https://doi.org/10.46471/GIGABYTE_SERIES_0006). This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.130), and has published the reviews under the same license. These are as follows.

      Reviewer 1. Richard Flamio Jr.

      Is the language of sufficient quality?

      No. There are some grammatical errors and spelling mistakes throughout the text.

      Is there sufficient detail in the methods and data-processing steps to allow reproduction?

      Yes. The authors did a phenomenal job at detailing the methods and data-processing steps.

      Additional Comments:

      Very nice job on the paper. The methods are sound and the statistics regarding the genome assembly are thorough. My only two comments are: 1) I think the paper could be improved by the correction of grammatical errors, and 2) I am interested in a discussion about the number of chromosomes expected for this species (or an estimate) based on related species and if the authors believe all of the chromosomes were identified. For example, is the karyotype known or can the researchers make any inferences about the number of microchromosomes in the assembly? Please see a recent paper I wrote on microchromosomes in the wood stork assembly (https://doi.org/10.1093/jhered/esad077) for some ideas in defining the chromosome architecture of the spoonbill and/or comparing this architecture to related species.

      Re-review:

      The authors incorporated the revisions nicely and have produced a quality manuscript. Well done.

      Minor revisions

      Line 46: A comma is needed after (Threskiornithidae).

      Line 47: “The” should not be capitalized.

      Line 48: This should read “as a globally endangered species.”

      Line 49: “However, the lack of genomic resources for the species hinders the understanding of its biology…”

      Line 56: Consider changing “also revealed” to “identified” to avoid repetition from the previous sentence.

      Line 65: Insert “the” before “bird’s.”

      Lines 69-70: Move “locally” higher in the sentence – “and it is protected locally…”

      Line 72: Replace “as of to date” with “prior to this study”.

      Lines 78-79: Pluralize “part.”

      Line 86: Replace “proceeded” with “processed.”

      Line 133: “…are listed in Table 1.”

      Line 158: “accounted”

      Line 159: “Variant calling was performed using…”

      Line 161: “Hard filtering was employed…”

      Lines 200-201: “The heterozygosity levels… from five individuals were comparable to previous reports on spoonbills – black-faced spoonbill … and royal spoonbill … (Li et al. 2022).”

      Line 202: New sentence. “The remaining heterozygosity levels observed…”

      Line 206: “…genetic bottleneck in the black-faced spoonbill…”

      Lines 208-209: “These results highlight the need…”

      Lines 213-214: “…which are useful and precious resources for future population genomic studies aimed at better understanding spoonbill species numbers and conservation.”

      Line 226: Missing a period after “heterozygosity.”

      For references, consider adding DOIs. Some citations have them but most citations would benefit from this addition.

      Reviewer 2. Phred Benham

      Is the language of sufficient quality?

      Generally yes, the language is sufficiently clear. However, a number of places could be refined and extra words removed.

      Are all data available and do they match the descriptions in the paper?

      Additional data is available on figshare.

      I do not see any of the tables that are cited in the manuscript and contain legends. Am I missing something? Also, there is no legend for the GenomeScope profile in Figure 3.

      The assembly appears to be on GenBank as a scaffold-level assembly; can you list this accession info in the data availability section in addition to the project number?

      Is there sufficient data validation and statistical analyses of data quality?

      Overall fine, but some additional analyses would aid the paper. Comparison of the spoonbill genome to other close relatives using a synteny plot would be helpful.

      It would also be useful to put heterozygosity and inbreeding coefficients into context by comparing to results from other species.

      Additional Comments:

      Hui et al. report a chromosome-level genome for the black-faced spoonbill, an endangered species of coastal wetlands in East Asia. This genome will serve as an important genome for understanding the biology of and conserving this species.

      Generally, the methods are sound and appropriate for the generation of genomic sequence.

      Major comments: This is a highly contiguous genome in line with metrics for Vertebrate Genomics Project genomes and other consortia. The authors argue that they have assembled 31 Pseudo-molecules or chromosomes. It would be nice to see a plot showing synteny of these 31 chromosomes and a closely related species with a chromosome level assembly (e.g. Theristicus caerulescens; GCA_020745775.1)

      The tables appear to be missing from the submitted manuscript?

      Minor comments: Line 49: delete its

      Line 49-51: This sentence is a little awkward, please revise.

      Line 64: delete 'the'

      Line 67: replace 'with' with 'the spoonbill as a'

      Line 68: delete 'Interestingly'

      Line 70: can you be more specific about what kind of genetic methods had previously been performed?

      Line 79: can you provide any additional details on the necessary permits and/or institutional approval

      Line 78: what kind of tissue? or were these blood samples?

      Line 110: do you mean movies?

      Line 143: replace data with dataset

      Line 163: it may be worth applying some additional filters in vcftools, e.g. minor allele freq., min depth, max depth, what level of missing data was allowed?, etc.

      Line 171: delete 'resulted in'

      Line 172: do you mean scaffold L50 was 8?

      Line 191-195: some context would be useful here, how does this level of heterozygosity and inbreeding compare to other waterbirds?

      Line 217: why did you use the Metazoan database and not the Aves_odb10 database for Busco?

      Figure 1b: Number refers to what, scaffolds? Be consistent with capitalization for Mb. It seems like the order of scaffold N50 and L50 were reversed.
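      Since the comments on Line 172 and Figure 1b both turn on the N50/L50 distinction, a minimal sketch of the standard definitions may help: N50 is the length of the scaffold at which the running total (longest first) reaches half the assembly size, and L50 is how many scaffolds that takes. The scaffold lengths below are made-up toy values, not numbers from the manuscript.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths.

    N50: length of the scaffold at which the cumulative sum, taken from the
    longest scaffold downwards, first reaches half of the total length.
    L50: the number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example with invented lengths (arbitrary units); total is 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_scaffolds)
print(n50, l50)
```

      N50 is therefore a length (reported in bp or Mb) and L50 a count, which is why swapping the two labels in a figure is easy to spot.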

      Figure 3 is missing a legend.

      Re-review:

      I previously reviewed this manuscript and overall the authors have done a nice job addressing all of my comments.

      I appreciate that the authors include the MCscan analysis that I suggested. However, the alignment of the P. minor assembly and annotations to other genomes suggests rampant mis-assembly or translocations. Birds have fairly high synteny and I would expect Pmin to look more similar to the comparison between T. caerulescens and M. americana in the MCscan plot. For instance, parts of the largest scaffold in the Pmin assembly map to multiple different chromosomes in the Tcae assembly. Similarly, the Z in Tcae maps to 11 different scaffolds in the Pmin assembly and there does not appear to be a single large scaffold in the Pmin assembly that corresponds to the Z chromosome.

      The genome seems to be otherwise of strong quality, so I urge the authors to double-check their MCscan synteny analysis. If this pattern remains, can you please add some comments about it to the end of the Data Validation and Quality Control section? I think other readers will also be surprised at the low levels of synteny apparent between the spoonbill and ibis assemblies.

    1. To date, a whopping two million digital documents have been annotated

      I wonder how many annotations have been scribbled in the margins of physical books.

      Alternatively, I wonder how many digital "annotations" exist only as fleeting thoughts captured by a notes app, removed from the context that created them. The ability to couple thoughts to context is why annotation is powerful.

    1. 160 samples

      If we don’t know whether 1 sample = 1 donation, I would call that out explicitly.

    2. SARS

      Looks like we only got SARS reads in two samples. This is a good candidate for validating with BLAST if there's time.

    3. species

      Also noteworthy that we saw a lot of Anelloviridae in the Cebria Mendoza RNA MGS paper.

      Maybe both RNA and DNA stages of Anelloviridae get picked up, or maybe the RNA > cDNA sample prep also carried over a bunch of Anelloviridae DNA.

    4. e preparation/library preparation is not taken

      I would highlight/link the Cebria Mendoza paper here.

    5. Popular/Well-known Name

      This is a great addition. I think it would actually be better to use the popular names in the above RA plot as well.

      We may also have some data source internally at SB that has a mapping from NCBI taxonomic virus name to common name.

    6. exclude Anelloviridae

      Why exclude Anelloviridae?

      I think it would make sense to have a version of this plot with human reads removed as we discussed in the context of Cebria Mendoza paper.

    7. 2.2 Quality control metrics

      Can the two QC sections be combined or are they stage specific?

    8. 3.2 Total viral content

      I would make this a histogram. You can make two histograms - one for classified reads and one for all reads.

    9. 4.1 Overall relative abundance

      I still think making a histogram is the way to go for this plot

    10. p_reads_max

      What is p_reads_max?

    1. Air

      Your link to the right up here is broken

    2. Contents

      Your contents here has way too much stuff. Break it into bigger chunks and use subheadings

    3. On October 30, 1948, the Donora High School Football team played through a dense smog to complete the game with hundreds of fans in the audience, despite very poor visibility.

      Citation?

    4. (Jacobs, Burgess, Abbott)

      Ah, here it is. Given that you are pulling from the story for an entire paragraph, I'd lead with some reference to this source.

      "In their book/article published in XXXX, Jacobs, Burgess, and Abbott tell the tale of...."

    5. The quality of air we breathe has direct impacts on our health. We must understand the factors that contribute to poor air quality and how we individually and collectively contribute to these changes. Until we can visualize the impact we have on our atmosphere, we will continue behavior that negatively impacts the air around us.

      Also a short paragraph.

    6. This event, known as the Donora Smog of 1948, prompted the country into taking a closer look at the negative impacts of air pollution. Widespread debate surrounding the event led to the first legislation aimed at regulating the air quality within the United States, ushering in a new era of tracking, combatting, and reversing the ill effects of poor air quality.

      Two sentences isn't much of a paragraph.

    7. changes

      factors

    8. NumPy and Pandas

      the NumPy and Pandas libraries

    9. Matplotlib and Seaborn for visualization, and Time Series forecasting algorithms such as Prophet and SARIMAX.

      This is not a complete sentence.

    10. We will address data inconsistencies, missing values and ensure that data is in a tidy format.

      This is not a paragraph

    11. We may need to normalize or standardize data if necessary and create new features through aggregation to enhance the model’s performance.

      Also not a paragraph

    12. p

      capitalized?

    13. Here’s a breakdown of its components:

      Is this supposed to be above the bullet points? Either way, I think those bullet points need a better intro.

    14. Metrics to Evaluate Machine Model Performance

      Any section needs to be introduced by text.

    15. Akaike Information Criteria (AIC)

      I don't think a reader has any idea what this is initially, so this chapter heading is kinda meaningless.

    16. Technique/Metric Description Purpose/Formula Scenario: Cancer prediction

      I don't think this table is useful in its current location. As a table, it should just be used as a reference and put at the end of the document. It is a nice summary table, for sure, but it doesn't belong smack in the middle of your paper.

      As far as a reader knowing what you are referring to when you use one of these terms, some you can probably safely assume you can use without explanation, and others you should bake the explanation into your text when you introduce it.

    17. Machine Learning AQI Time Series

      Text should introduce every section.

    18. Used to measure of a statistical model, it quantifies:

      Not a complete sentence

    19. Data Explaination

      Why is this part of the ML AQI Time Series chapter? The chapter/heading hierarchy is extremely confusing in general.

    20. The Akaike Information Criterion (AIC) is a measure used to compare different statistical models. It helps in model selection by balancing the goodness of fit and the complexity of the model. Here’s how to interpret the AIC value:

      This feels more like how this section should be starting.

    21. The files were given daily on a county wide basis, separated into different files by year.

      So what did you collect?

    22. Indoors, high humidity can trap air, leading to the growth of mold and harmful bacteria.

      This feels outside the scope of what you are doing, though, correct?

    23. Air Quality Data:

      These sections are too small for their own sections. Just make them their own paragraphs.

      EDIT: Actually, some of the later ones are more reasonable. Think about how you can balance between them though. Can you add to some to make it more reasonable as a section? Or remove from others? Maybe bullet points with a bolded starting line would be more appropriate?

    24. calculated

      aggregated

    25. Carbon Monoxide

      Carbon Monoxide (CO)

    26. Only motorbus data was used, which may not be reflective of cities with other large methods of public transportation, such as the New York subway system.

      It also seems to leave out what I'd guess is probably easily the most significant transit factor: cars and trucks?

    27. is updated as of

      was last updated on

    28. relevant columns were selected and renamed, reducing the information being brought into our initial SQL database.

      Just selecting and renaming wouldn't reduce the information, unless you are trying to say that you didn't bring in anything else.

    29. and imported

      remove

    30. The first dimension table is the dates table, a serialized list of dates from January 1st, 2015 to December 31st, 2022.

      You should explain why you did this. Otherwise breaking it out into a table of essentially 1 data column seems pointless. I'm pretty sure I recall the reason why, and it is a decent reason, but that is not apparent here.

    31. as well as the population and population density

      that is not shown in your ERD

    32. Understanding the context of a specified line requires joining the table back to the fact table, and joining the location and date tables to that as well.

      Ok, but I'm pretty sure this totally undid any of the space saving measures you gained with putting dates in their own table. Because you are including a massive number of duplicate items in your main table. You could have just left them separate and still joined by truncating the date to a year and matching that + location
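      To sketch what I mean: the yearly figures can stay in their own table and be matched to each daily row by year and location at query time. The table and column names below (`daily_air`, `yearly_transit`) are hypothetical stand-ins, not your actual schema, the 2016 row is made up, and SQLite is used only because it ships with Python.

```python
import sqlite3

# In-memory database with two hypothetical tables: daily air quality
# readings and yearly transit figures (one row per city per year).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_air (date TEXT, city TEXT, aqi REAL);
    CREATE TABLE yearly_transit (year INTEGER, city TEXT, num_busses INTEGER);
    INSERT INTO daily_air VALUES
        ('2015-01-04', 'Phoenix', 86.5),
        ('2016-01-04', 'Phoenix', 70.0);   -- made-up row for illustration
    INSERT INTO yearly_transit VALUES
        (2015, 'Phoenix', 729),
        (2016, 'Phoenix', 740);            -- made-up row for illustration
""")

# Truncate each daily date to its year and match on year + city, so the
# yearly values are never duplicated into the daily table.
rows = conn.execute("""
    SELECT d.date, d.city, d.aqi, t.num_busses
    FROM daily_air AS d
    JOIN yearly_transit AS t
      ON t.city = d.city
     AND t.year = CAST(strftime('%Y', d.date) AS INTEGER)
    ORDER BY d.date
""").fetchall()

for row in rows:
    print(row)
```

      The yearly table then holds one row per city per year, and the duplication only exists in the query result rather than in storage.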

    33. Finally, constraints have been added to limit unusual or impossible data.

      Should probably describe these, since they aren't apparent in the ERD at all.

    34. Figure 1.

      Reference these properly in Quarto. (It will also make your life easier)

    35. ERD Diagram

      You need a much more comprehensive caption here.

    36. Exploratory Data Analysis

      EDA is what you do to narrow down what actual analysis you want or need to do to answer your question. It probably should not be shown here unless mandatory for understanding a later piece of analysis.

    37. Dataframe Shape The DataFrame contains 147039 rows and 44 columns.

      Wat? Why is this here in this form?

    38. Exploring Oregon State By filtering our Dataframe for Oregon state, our DataFrame contains 2922 rows.

      Yeah, that's not a section, nor should it be. Mistake with #?

    39. Features Engineering Date Column Preprocessing:

      This is a paper, not notes of what was done. You need to explain these and describe what was done. A flowchart might also be very useful.

    40. Sweetviz Data Report Done! Use 'show' commands to display/save.    [100%]   00:01 -> (00:00 left) {"model_id":"0e8836738d0b492e92ad430e32f1e8d7","version_major":2,"version_minor":0,"quarto_mimetype":"application/vnd.jupyter.widget-view+json"} Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files. We have generated a complete statistical report confirming the quality of EDA steps.

      Again, no need for this to be here. It contributes nothing to your story. Or if it does, you need to do a MUCH better job of making that clear. You can mention and link it in an appendix if you want.

    41. - Set tsmode=True when creating the ProfileReport - Ensure our DataFrame is sorted or specify the sortby parameter - Time Series Feature Identification

      Mangled formatting.

    42. Advanced Exploratory Data Analysis

      Probably also shouldn't be here, though it depends on what you mean by this.

    43. - Histograms are replaced with line plots - Feature details include new autocorrelation and partial autocorrelation plots - Two additional warnings may appear: NON STATIONARY and SEASONAL

      Mangled formatting

    44. These methods allowed us to thoroughly evaluate key data quality aspects, including: Class balance in categorical variables Presence and distribution of missing values (NaN) Feature distributions and correlations Potential time-series characteristics

      Ok, but I haven't seen you talk about any of these yet. So what use then were they toward answering your overall question?

    45. Time Series Visualization: CO, Wind and AQI

      Why is this a chapter? How is it contributing? Like it might be useful information for your question, but a chapter all by itself?

    46. NO2 (nitrogen dioxide) is an important air pollutant. Here’s a concise overview of it: - Reddish-brown gas with a pungent odor - Part of a group of pollutants known as nitrogen oxides (NOx) SO2 (sulfur dioxide) is an important air pollutant. Here’s a concise overview of SO2 as a pollutant: Colorless gas with a sharp, pungent odor Highly soluble in water Ozone (O₃) as a pollutant is a complex topic, as it can be both beneficial and harmful depending on its location in the atmosphere. Here’s a concise overview of ozone as a ground-level pollutant: Colorless to pale blue gas with a distinctive smell Highly reactive molecule composed of three oxygen atoms

      Again, wasn't all of this covered in the background?

    47. CO pollutant refers to carbon monoxide, which is a colorless, odorless, and tasteless gas that can be harmful to human health and the environment.

      Should have already established this in your background.

    48. Primarily produced by incomplete combustion of carbon-containing fuels Major sources include vehicle exhaust, industrial processes, and some natural sources like volcanoes Slightly less dense than air Highly flammable

      mangled formatting I think

    49. <Figure size 1000x1800 with 0 Axes>

      Figure appears before reference and explanation in text.

      Also, figure isn't actually a figure and has no caption.

      Also, plot is WAY too big for writeup

    50. <Figure size 1500x2000 with 0 Axes>

      Same issues as the above figure: it is not explained in the text, it is not an actual figure with a caption and reference, and it is way too large for the format.

    51. we must

      You must? That is the only possible approach?

    52. We finally completed the exploratory data analysis.

      And you seemingly concluded nothing from it? Why should a reader care about this?

    53. 147039

      No. You do not include raw tabulated output like this in a publication. The columns aren't even labeled, so a reader has no idea what they are looking at. If it is worth showing a reader, then you render it properly, make sure everything is labeled, insert it as a table with a caption and reference and discuss it in the text.

    54. Ultimately, we want to see which variables have the greatest impact on AQI

      The AQI is defined in terms of some of these correct? So those should probably not be included?
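      To make the circularity concrete, here is a rough sketch of how an AQI value is typically derived: each pollutant concentration is linearly interpolated onto the index scale between breakpoints, and the overall AQI is the largest of those sub-indices. The breakpoint numbers below are placeholders for illustration only, not the official table, so check the EPA documentation for the real values.

```python
def sub_index(conc, breakpoints):
    """Linearly interpolate a concentration onto the index scale.

    Each breakpoint row is (conc_low, conc_high, index_low, index_high).
    """
    for c_lo, c_hi, i_lo, i_hi in breakpoints:
        if c_lo <= conc <= c_hi:
            return (i_hi - i_lo) / (c_hi - c_lo) * (conc - c_lo) + i_lo
    raise ValueError("concentration outside the breakpoint table")


def overall_aqi(sub_indices):
    """The reported AQI is the worst (largest) pollutant sub-index."""
    return max(sub_indices)


# Placeholder breakpoints for illustration only (NOT the official table).
PM25_BREAKPOINTS = [(0.0, 12.0, 0, 50), (12.1, 35.4, 51, 100)]
O3_BREAKPOINTS = [(0.000, 0.054, 0, 50), (0.055, 0.070, 51, 100)]

pm25_index = sub_index(6.0, PM25_BREAKPOINTS)
o3_index = sub_index(0.0625, O3_BREAKPOINTS)
aqi = overall_aqi([pm25_index, o3_index])
print(pm25_index, o3_index, aqi)
```

      If the AQI is computed this way, a model given the pollutant concentrations as features is mostly relearning this formula rather than discovering anything new.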

    55. First, missing data must be addressed.

      This wasn't addressed as any of your earlier pre-processing?

    56. date

      Now I have even less idea of what this is showing me

    57. Since AQI is the dependent variable being measured, all rows without AQI data are dropped. Certain cities have very little data and will be dropped out of necessity.

      Ok. How little is very little data? Why is it necessary?

    58. The data collected has separate information for the city of New York City. NYC is divided into five boroughs, each within its own county. These values are grouped and averaged out to make NYC have the same amount of datapoints as every other city.

      Are other suburbs of major cities not counted separately? It seems like this could be a tricky thing to be fair about. And counties kinda already split things in an unambiguous way?

    59. date state county city population density \

      Pretty sure this output should absolutely be removed.

    60. figure 24324

      I missed the other 24 thousand 300 somewhere....

      Also, the thing below is a table, and should be referenced and captioned as such.

    61. Kansas City 241

      I think the count is largely unnecessary to show here, but what is up with Kansas City? And why is it not discussed when it is seemingly the only take-away I get from this table?

    62. To perform a ML prediction algorithm, the predicted variable (AQI) must be discrete.

      That doesn't seem correct. You can do all manner of regression algorithms with machine learning. No need to make this into a classification problem unless your SPECIFIC algorithm requires it. In which case you should discuss why you are using that specific algorithm.

    63. The bins chosen are: 0-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100 101-150 151+

      How were these chosen?

    64. date state county city population density aqi temp pressure humidity ... pm100 pm25 num_busses revenue operating_expense passenger_trips operating_hours passenger_miles operating_miles aqi_discrete 0 2015-01-04 Arizona Maricopa Phoenix 4064275 1198.9 86.50 41.458333 972.860814 60.739583 ... 20.709822 20.505426 729.0 47024975.0 2.256208e+08 55497019.0 2228182.0 2.190928e+08 28371107.0 81-90 1 2015-01-04 California Los Angeles Los Angeles 11922389 3184.7 118.50 50.281250 1010.666650 58.177084 ... 24.954331 21.400000 2259.0 273158938.0 1.056348e+09 367104774.0 7938548.0 1.448619e+09 84041668.0 101-150 7 2015-01-04 District Of Columbia District of Columbia Washington 5116378 4235.7 45.25 41.149479 1015.388525 61.075000 ... 13.750000 11.088055 1394.0 149657899.0 6.453259e+08 139353079.0 4115200.0 4.293390e+08 39643319.0 41-50 17 2015-01-04 Massachusetts Suffolk Boston 4328315 5319.0 42.75 33.458332 1018.536500 58.234375 ... 6.000000 7.244791 800.0 96572664.0 4.080501e+08 122496729.0 2231562.0 3.162285e+08 22115804.0 41-50 18 2015-01-04 Michigan Wayne Detroit 3725908 1772.2 49.25 30.203125 994.260400 72.671876 ... 18.250000 9.394618 432.0 31303313.0 1.729056e+08 33078462.0 1225079.0 1.696881e+08 17705665.0 41-50 5 rows × 25 columns

      What even am I looking at here?

    65. Definition 1

      The interactivity of the below is neat, but you need to talk about it!

    66. The following tools are used: Train Test Split One Hot Encoder Transformer Pipeline Standard Scaler

      For what purposes?

    67. Feature selection is done on the data.

      How? And what are the raw results?

    68. Carbon Monoxide Nitrogen Dioxide Ozone PM10 PM2.5

      Aren't all of these literally part of the definition of AQI?

    69. K nearest neighbors Tree model Random Forest model Logistic Regression Naive Bayes

      This is essentially the equivalent of EDA in ML. A reader doesn't care about all of the attempts that didn't go as well unless something critical was shown in that case. Just move straight to the best and discuss what it implies.

    70. 'city_Los Angeles', 'city_Phoenix', 'city_Portland

      These three?

    71. Definition 2

      I'm confused why these are just labeled as Definitions?

    72. AirQuality Confusion Matrix 1

      What model is the above even for?? How is a reader supposed to interpret this?

    73. Pipeline

      Not explained in the text, as far as I can understand.

    74. the model.

      WHICH?

    75. A randomized search is run with 100 iterations.

      Like actual just random values for these parameters each time?
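      For reference, this is roughly what I understand a randomized search to do: draw a random combination from the candidate values on each iteration, score it, and keep the best one seen. The sketch below is plain Python with a made-up scoring function standing in for cross-validated accuracy. The parameter names mirror random forest settings, but the candidate values are assumptions, not your actual search space.

```python
import random


def random_search(score, space, n_iter=100, seed=0):
    """Sample n_iter random parameter combinations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # One random draw per parameter, independent of earlier iterations.
        params = {name: rng.choice(values) for name, values in space.items()}
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical search space (not the values used in the paper).
space = {"n_estimators": [50, 100, 200, 400], "max_depth": [None, 5, 10, 20]}


def toy_score(params):
    """Made-up stand-in for a cross-validated score; higher is better."""
    depth = params["max_depth"] or 30
    return -abs(params["n_estimators"] - 200) / 400 - abs(depth - 10) / 30


best_params, best_score = random_search(toy_score, space)
print(best_params, best_score)
```

      If that is what was done, the text should say which parameters were sampled, from what ranges, and how each draw was scored.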

    76. {'memory': None, 'steps': [('aqi_transformer', ColumnTransformer(transformers=[('categories', OneHotEncoder(handle_unknown='infrequent_if_exist', min_frequency=5, sparse_output=False), ['city']), ('scaled_air_quality', StandardScaler(), ['temp', 'humidity', 'co', 'no2', 'o3', 'pm100', 'pm25'])], verbose_feature_names_out=False)), ('RF_model', RandomForestClassifier())], 'verbose': False, 'aqi_transformer': ColumnTransformer(transformers=[('categories', OneHotEncoder(handle_unknown='infrequent_if_exist', min_frequency=5, sparse_output=False), ['city']), ('scaled_air_quality', StandardScaler(), ['temp', 'humidity', 'co', 'no2', 'o3', 'pm100', 'pm25'])], verbose_feature_names_out=False), 'RF_model': RandomForestClassifier(), 'aqi_transformer__n_jobs': None, 'aqi_transformer__remainder': 'drop', 'aqi_transformer__sparse_threshold': 0.3, 'aqi_transformer__transformer_weights': None, 'aqi_transformer__transformers': [('categories', OneHotEncoder(handle_unknown='infrequent_if_exist', min_frequency=5, sparse_output=False), ['city']), ('scaled_air_quality', StandardScaler(), ['temp', 'humidity', 'co', 'no2', 'o3', 'pm100', 'pm25'])], 'aqi_transformer__verbose': False, 'aqi_transformer__verbose_feature_names_out': False, 'aqi_transformer__categories': OneHotEncoder(handle_unknown='infrequent_if_exist', min_frequency=5, sparse_output=False), 'aqi_transformer__scaled_air_quality': StandardScaler(), 'aqi_transformer__categories__categories': 'auto', 'aqi_transformer__categories__drop': None, 'aqi_transformer__categories__dtype': numpy.float64, 'aqi_transformer__categories__feature_name_combiner': 'concat', 'aqi_transformer__categories__handle_unknown': 'infrequent_if_exist', 'aqi_transformer__categories__max_categories': None, 'aqi_transformer__categories__min_frequency': 5, 'aqi_transformer__categories__sparse_output': False, 'aqi_transformer__scaled_air_quality__copy': True, 'aqi_transformer__scaled_air_quality__with_mean': True, 
'aqi_transformer__scaled_air_quality__with_std': True, 'RF_model__bootstrap': True, 'RF_model__ccp_alpha': 0.0, 'RF_model__class_weight': None, 'RF_model__criterion': 'gini', 'RF_model__max_depth': None, 'RF_model__max_features': 'sqrt', 'RF_model__max_leaf_nodes': None, 'RF_model__max_samples': None, 'RF_model__min_impurity_decrease': 0.0, 'RF_model__min_samples_leaf': 1, 'RF_model__min_samples_split': 2, 'RF_model__min_weight_fraction_leaf': 0.0, 'RF_model__monotonic_cst': None, 'RF_model__n_estimators': 100, 'RF_model__n_jobs': None, 'RF_model__oob_score': False, 'RF_model__random_state': None, 'RF_model__verbose': 0, 'RF_model__warm_start': False}

      Definitely don't show this!

    77. Definition 3   0.6312159709618875

      What does this mean?

    78. Figure

      It is a table, not a figure.

    79. 0.5265748745864021

      Comment on this if you are going to show it.

    80. Decomposing the Time Series With Additive Method

      Is this supposed to be much more of a subheading?

    81. AirQuality Confusion Matrix 2

      Captions need to be more detailed and discuss what a reader should take away from an image.

    82. decomposed

      composed

    83. By the help of statsmodel package we can break the time series into its seasonal pattern and trends. This will helps us to understand the data clearly and will help us to make more sense of the data.

      Ok, so how did you go about doing that?
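      As an illustration of the level of detail I'm looking for, here is a minimal sketch of a classical additive decomposition (observed = trend + seasonal + residual). It is written in plain Python on a synthetic monthly series, so it is not your method or the statsmodels implementation, and it assumes an even period of 12.

```python
def additive_decompose(y, period=12):
    """Classical additive decomposition for an even seasonal period."""
    n, half = len(y), period // 2

    # Trend: centered moving average; the two end points of each window
    # get half weight so the window spans exactly one full period.
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half : t + half + 1]
        trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period

    # Seasonal: average the detrended values for each position in the cycle,
    # then shift the averages so they sum to zero over one period.
    detrended = [y[t] - trend[t] if trend[t] is not None else None for t in range(n)]
    means = []
    for k in range(period):
        values = [d for d in detrended[k::period] if d is not None]
        means.append(sum(values) / len(values))
    center = sum(means) / period
    seasonal = [means[t % period] - center for t in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    resid = [
        y[t] - trend[t] - seasonal[t] if trend[t] is not None else None
        for t in range(n)
    ]
    return trend, seasonal, resid


# Synthetic series: linear trend plus a fixed zero-mean monthly pattern.
pattern = [3, 1, -2, -4, -1, 0, 2, 4, 1, -1, -2, -1]
series = [50 + 0.5 * t + pattern[t % 12] for t in range(36)]
trend, seasonal, resid = additive_decompose(series)
print(trend[6], seasonal[:12])
```

      A couple of sentences describing the equivalent steps for your data, and which period you assumed, would answer the question.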

    84. three

      You literally JUST told me there were 2...

    85. There are

      The above image is an unlabeled figure with no caption that is discussed nowhere in the text (at least so far). All of those things are problematic.

    86. If you have an increasing trend, you still see roughly the same size peaks and troughs throughout the time series. This is often seen in indexed time series where the absolute value is growing but changes stay relative.

      But is that what you are seeing here? It is confusing if you are talking in the abstract or about your specific data.

      Also, why do this? What are your takeaways? This feels like it exists in isolation.

    87. attempts to compute the optimum values of hyperparameters.

      Say how it works!
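      Something like the following is what I'd expect the explanation to boil down to: try every combination of non-seasonal and seasonal orders, fit a model for each, and keep the one with the lowest AIC. This is only a sketch. `toy_aic` is a hypothetical stand-in for fitting a model and reading off its AIC, and the target order inside it is arbitrary.

```python
import itertools


def grid_search(fit_and_score, values=(0, 1), season=12):
    """Exhaustively score every (p, d, q) x (P, D, Q, season) combination."""
    orders = list(itertools.product(values, repeat=3))
    best = None
    n_tried = 0
    for order in orders:
        for seasonal in orders:
            seasonal_order = seasonal + (season,)
            score = fit_and_score(order, seasonal_order)
            n_tried += 1
            # Lower AIC is better, so keep the smallest score seen so far.
            if best is None or score < best[0]:
                best = (score, order, seasonal_order)
    return best, n_tried


def toy_aic(order, seasonal_order):
    """Hypothetical stand-in for fitting a model and returning its AIC."""
    target = (1, 1, 1, 0, 1, 1)  # arbitrary choice for illustration
    combo = order + seasonal_order[:3]
    return 560.0 + 10.0 * sum(a != b for a, b in zip(combo, target))


best, n_tried = grid_search(toy_aic)
print(n_tried, best)
```

      With two candidate values for each of six orders, that is 64 model fits, which is worth stating in the text instead of printing every fit.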

    88. grid search method

      This isn't code, so it shouldn't be in monospace. Underline or italicize it if you want to set it apart, or put it in quotes.

    89. ARIMA(0, 0, 0)x(0, 0, 0, 12) - AIC:969.5419650946665 ARIMA(0, 0, 0)x(0, 0, 1, 12) - AIC:799.0140026908043 ARIMA(0, 0, 0)x(0, 1, 0, 12) - AIC:701.7072455506197 ARIMA(0, 0, 0)x(0, 1, 1, 12) - AIC:568.3211239351035 ARIMA(0, 0, 0)x(1, 0, 0, 12) - AIC:708.2727189545345 ARIMA(0, 0, 0)x(1, 0, 1, 12) - AIC:660.9171130206936 ARIMA(0, 0, 0)x(1, 1, 0, 12) - AIC:596.1563221105039 ARIMA(0, 0, 0)x(1, 1, 1, 12) - AIC:571.8620221843147 ARIMA(0, 0, 1)x(0, 0, 0, 12) - AIC:888.4893265461405 ARIMA(0, 0, 1)x(0, 0, 1, 12) - AIC:754.7451219152275 ARIMA(0, 0, 1)x(0, 1, 0, 12) - AIC:695.0468020327725 ARIMA(0, 0, 1)x(0, 1, 1, 12) - AIC:563.3526496700842 ARIMA(0, 0, 1)x(1, 0, 0, 12) - AIC:708.3487691701486 ARIMA(0, 0, 1)x(1, 0, 1, 12) - AIC:655.8968840891383 ARIMA(0, 0, 1)x(1, 1, 0, 12) - AIC:598.1490374699148 ARIMA(0, 0, 1)x(1, 1, 1, 12) - AIC:566.3367865157978 ARIMA(0, 1, 0)x(0, 0, 0, 12) - AIC:769.1876196189784 ARIMA(0, 1, 0)x(0, 0, 1, 12) - AIC:681.4253047727481 ARIMA(0, 1, 0)x(0, 1, 0, 12) - AIC:740.3973501203114 ARIMA(0, 1, 0)x(0, 1, 1, 12) - AIC:606.0067883430007 ARIMA(0, 1, 0)x(1, 0, 0, 12) - AIC:688.9276375883021 ARIMA(0, 1, 0)x(1, 0, 1, 12) - AIC:683.2372837276466 ARIMA(0, 1, 0)x(1, 1, 0, 12) - AIC:637.9760649104885 ARIMA(0, 1, 0)x(1, 1, 1, 12) - AIC:607.9989487123431 ARIMA(0, 1, 1)x(0, 0, 0, 12) - AIC:717.0512101206406 ARIMA(0, 1, 1)x(0, 0, 1, 12) - AIC:636.373429528529 ARIMA(0, 1, 1)x(0, 1, 0, 12) - AIC:692.512410906277 ARIMA(0, 1, 1)x(0, 1, 1, 12) - AIC:559.6920424480529 ARIMA(0, 1, 1)x(1, 0, 0, 12) - AIC:650.5293595230056 ARIMA(0, 1, 1)x(1, 0, 1, 12) - AIC:638.1908637932411 ARIMA(0, 1, 1)x(1, 1, 0, 12) - AIC:594.940391452659 ARIMA(0, 1, 1)x(1, 1, 1, 12) - AIC:562.5484300875305 ARIMA(1, 0, 0)x(0, 0, 0, 12) - AIC:775.150570595756 ARIMA(1, 0, 0)x(0, 0, 1, 12) - AIC:688.1982167211085 ARIMA(1, 0, 0)x(0, 1, 0, 12) - AIC:702.425519762607 ARIMA(1, 0, 0)x(0, 1, 1, 12) - AIC:570.1689904036024 ARIMA(1, 0, 0)x(1, 0, 0, 12) - AIC:688.2931195730088 ARIMA(1, 0, 0)x(1, 0, 1, 12) - 
AIC:662.6749372683774 ARIMA(1, 0, 0)x(1, 1, 0, 12) - AIC:590.7883988000217 ARIMA(1, 0, 0)x(1, 1, 1, 12) - AIC:573.825547011459 ARIMA(1, 0, 1)x(0, 0, 0, 12) - AIC:725.2611476282008 ARIMA(1, 0, 1)x(0, 0, 1, 12) - AIC:644.4595774810737 ARIMA(1, 0, 1)x(0, 1, 0, 12) - AIC:696.6355146715679 ARIMA(1, 0, 1)x(0, 1, 1, 12) - AIC:565.337721591011 ARIMA(1, 0, 1)x(1, 0, 0, 12) - AIC:651.3742765976529 ARIMA(1, 0, 1)x(1, 0, 1, 12) - AIC:657.7255114881699 ARIMA(1, 0, 1)x(1, 1, 0, 12) - AIC:592.7702867201957 ARIMA(1, 0, 1)x(1, 1, 1, 12) - AIC:567.3861300859227 ARIMA(1, 1, 0)x(0, 0, 0, 12) - AIC:750.4532664961456 ARIMA(1, 1, 0)x(0, 0, 1, 12) - AIC:665.693748389872 ARIMA(1, 1, 0)x(0, 1, 0, 12) - AIC:720.7807876037391 ARIMA(1, 1, 0)x(0, 1, 1, 12) - AIC:588.6301637485213 ARIMA(1, 1, 0)x(1, 0, 0, 12) - AIC:665.7141239363682 ARIMA(1, 1, 0)x(1, 0, 1, 12) - AIC:667.6890275833365 ARIMA(1, 1, 0)x(1, 1, 0, 12) - AIC:611.4437482645567 ARIMA(1, 1, 0)x(1, 1, 1, 12) - AIC:590.6185673644065 ARIMA(1, 1, 1)x(0, 0, 0, 12) - AIC:717.3211552781574 ARIMA(1, 1, 1)x(0, 0, 1, 12) - AIC:636.7110296932944 ARIMA(1, 1, 1)x(0, 1, 0, 12) - AIC:693.1696490581699 ARIMA(1, 1, 1)x(0, 1, 1, 12) - AIC:561.5301944999834 ARIMA(1, 1, 1)x(1, 0, 0, 12) - AIC:643.9735168529521 ARIMA(1, 1, 1)x(1, 0, 1, 12) - AIC:638.640931561371 ARIMA(1, 1, 1)x(1, 1, 0, 12) - AIC:588.5992832053371 ARIMA(1, 1, 1)x(1, 1, 1, 12) - AIC:564.5468753697722

      This should not be shown.

    90. Summary of SARIMAX Print the summary which includes AIC

      Why all these other headings? This is still part of the above?

    91. ============================================================================== coef std err z P>|z| [0.025 0.975] ------------------------------------------------------------------------------ ar.L1 0.0483 0.306 0.158 0.875 -0.551 0.648 ma.L1 -1.0000 924.523 -0.001 0.999 -1813.031 1811.031 ma.S.L12 -1.0000 2355.498 -0.000 1.000 -4617.692 4615.692 sigma2 134.1503 3.35e+05 0.000 1.000 -6.57e+05 6.57e+05 ==============================================================================

      Yup, that won't mean a thing to most readers (myself included) unless you explain it.

    92. How Fit the SARIMAX model

      There is no "how to" here at all.

    93. Plot Diag

      reference the figure correctly and explain it in the text!

      Also, give it an actual meaningful caption.

    94. Rigorous validation is paramount to establishing the model’s reliability and practical application. To ensure the model’s generalizability, we will employ a train-test split.

      Why is this just being mentioned after all of the ML stuff has happened?

    95. The AIC value is: 561.5301944999834

      Which tells us what?
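      A single AIC value means little by itself, since it is only useful for comparing candidate models fitted to the same data. As a reminder of the definition, AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The sketch below compares two of the AIC values from your own grid output. The log-likelihood passed to `aic` is made up for illustration.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def relative_likelihood(aic_candidate, aic_best):
    """How plausible a candidate is relative to the lowest-AIC model."""
    return math.exp((aic_best - aic_candidate) / 2)


# Made-up log-likelihood and parameter count, just to show the formula.
example_aic = aic(log_likelihood=-276.0, n_params=4)

# Two AIC values taken from the grid search output quoted above.
aic_a = 561.5301944999834   # ARIMA(1, 1, 1)x(0, 1, 1, 12)
aic_b = 564.5468753697722   # ARIMA(1, 1, 1)x(1, 1, 1, 12)
ratio = relative_likelihood(aic_b, aic_a)
print(example_aic, ratio)
```

      A statement along the lines of "this model is preferred over that one by this margin" is the takeaway a reader needs here.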

    96. Start date of the data: 2015-01-31 00:00:00 End date of the data: 2022-12-31 00:00:00

      ??

    97. To facilitate c

      The above graphic is a figure. Treat it as such!

      Also, use dashed lines for one of the entries so that we can see that they are actually perfectly overlapping and not just take your word for it.

    98. The Mean Squared Error of our forecasts is 1.41

      units? What do you conclude from this?
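      On the units: MSE is in squared units of whatever is being forecast, so reporting the root mean squared error (RMSE) is usually easier to interpret. If the 1.41 is in squared AQI units, the RMSE would be about 1.19 AQI points. A minimal sketch with made-up numbers:

```python
import math


def mse(actual, predicted):
    """Mean squared error, in squared units of the target variable."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target variable."""
    return math.sqrt(mse(actual, predicted))


# Made-up observed and forecast values for illustration only.
actual = [40, 42, 45, 50]
predicted = [41, 41, 46, 49]

print(mse(actual, predicted), rmse(actual, predicted))
```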

    99. Forecasting Future Values As we conclude our modeling process, we generate predictions for the next 7 data points: Model Information: The result variable contains our fitted model’s details. Forecasting Method: We use the .get_forecast() method on our model results. Prediction Generation: This method analyzes observed patterns in our data to project future values. Output: We obtain forecasts for the next 7 time points, representing predicted air quality levels. This step transforms our analytical work into actionable insights for air quality management.

      I don't understand what you are trying to say or do here. You have already done some of this above (I think) so I'd guess this is a summary, except that some of this I'm pretty sure I haven't seen?

    100. Our plot

      Reference the figure number! And stick a caption on it!

      Is this plot for Portland? That isn't apparent anywhere that I can see either.

    101. Interpreting the Forecast Plot

      unnecessary

    102. Represents the actual, historical air quality measurements Provides a baseline for comparing our predictions Forecasted Values (Orange Line) Depicts the future air quality levels predicted by our SARIMAX Time Series Model Allows us to visualize potential trends and patterns in air quality Confidence Interval (Shaded Region) The shaded area around the forecast line represents the 95% Confidence Interval (CI) Indicates the range within which we can be 95% confident that the true future values will fall Wider intervals suggest greater uncertainty in the prediction

      This is a publication. Use complete sentences.

    103. exasperated by the dry heat and lack of rainfall

      Is this the actual cause? You showed some seasonality, I'm not sure these causes were showcased.

    104. we have landed on these specific recommendations.

      Ok, let me just say that at this point, after reading through all your above analysis, I have NO IDEA what your recommendations are going to be. Which probably tells me that you did a poor job of actually showcasing your proof for each of these recommendations.

      I haven't read what they are yet, but for every recommendation you make, I should be able to go back to a specific section or figure and see the exact reason why you would make that recommendation. If that is not the case, then you are either making unfounded recommendations, or you are not communicating what your analysis was for clearly enough.

    105. As climate change raises temperatures and water sources dry up, wildfire season will continue to get worse over time.

      Agreed. How would you interpret your data in that light? Can you see evidence of that? Is the effect more pronounced in cities near lots of national forest? Otherwise you are just conjecturing.

    106. Weather conditions Wind speed and direction Temperature fluctuations Humidity levels Atmospheric pressure Solar radiation intensity

      Significantly affected? I thought you only saw a few of these at best as being significant contributors.

    107. That leaves us with three criteria gasses and all particulate matter.

      But again, these are just part of the definition of AQI, aren't they? So of course they have a large impact?

    108. The largest source of carbon monoxide, nitrogen dioxide, and ozone is the cars, trucks, and other vehicles we use daily (Environmental Protection Agency). We can lower our reliance on personal vehicles by utilizing public transportation, carpooling, walking, biking, increasing work from home to lower commutes when available, and overall be more considerate about if driving a car is necessary.

      Did you see evidence of this? You had bus data. Did cities with less traffic show decreases in these values?

    109. Algorithm Dependence. This is the reliability of forecasts which are inherently tied to the chosen predictive algorithms. Different models may yield varying results, emphasizing the importance of algorithm selection and validation.

      So how did you choose your algorithms with this in mind?

    110. Industrial manufacturing processes and agriculture are significant polluters of the environment. We should invest in the research of more environmentally friendly manufacturing methods, working with materials that require less combustion, or are recyclable.

      Agreed, but I'm not sure you could see from your research if this was what was playing a large role?

    1. Air

      Your link to the right up here is broken

    2. Contents

      Your contents here has way too much stuff. Break it into bigger chunks and use subheadings

    3. On October 30, 1948, the Donora High School Football team played through a dense smog to complete the game with hundreds of fans in the audience, despite very poor visibility.

      Citation?

    4. (Jacobs, Burgess, Abbott)

      Ah, here it is. Given that you are pulling from the story for an entire paragraph, I'd lead with some reference to this source.

      "In their book/article published in XXXX, Jacobs, Burgess, and Abbott tell the tale of...."

    5. This event, known as the Donora Smog of 1948, prompted the country into taking a closer look at the negative impacts of air pollution. Widespread debate surrounding the event led to the first legislation aimed at regulating the air quality within the United States, ushering in a new era of tracking, combatting, and reversing the ill effects of poor air quality.

      Two sentences isn't much of a paragraph.

    6. changes

      factors