Future-Proofing and Actionable Insights
(1) and (4) are the emphasis of our paper so far, but check against the dataset.
tacit knowledge
Make the implicit explicit.
document for the future stranger
Yes - also part of the low-friction transition equation.
loose coupling and clear APIs
Kansa - should bring up the 'loose coupling' argument.
modularize functionality such that core research data and methods are decoupled from any one interface
Yes, this is part of the low-friction transitions / data portability. Mention functionality modularisation.
Designing with longevity in mind often means choosing simpler, well-supported technologies over cutting-edge but ephemeral ones.
Again, perhaps we can test the 'simplicity' argument using the dataset we're building.
Software Management Plans (SMPs)
Not a bad idea, haven't seen any in the wild
indicators of health (akin to how ecologists track species populations). For example, an indicator might be “active installations” of a software – if that number drops to zero, the software is effectively dead.
Perhaps we can find some metric here that could be expressed in / derived from the dataset to show (a) maximum use of the tool, (b) when the tool died.
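A minimal sketch of how the two indicators might be derived. The data schema (yearly usage counts per tool — citations, mentions, or installs) is an assumption about what our dataset will contain, not something from the literature:

```python
# Hypothetical sketch: derive (a) peak use and (b) year of "death" for a
# tool from yearly usage counts (citations, mentions, installs, etc.).
# The {year: count} schema is an assumed dataset structure.

def health_indicators(yearly_counts):
    """yearly_counts: dict mapping year -> usage count.
    Returns (peak_year, death_year); death_year is None if still alive."""
    if not yearly_counts:
        return None, None
    peak_year = max(yearly_counts, key=yearly_counts.get)
    # "Death" = first post-peak year from which counts stay at zero,
    # akin to "active installations" dropping to zero.
    death_year = None
    for year in sorted(yearly_counts):
        if year > peak_year and yearly_counts[year] == 0:
            if all(yearly_counts[y] == 0 for y in yearly_counts if y >= year):
                death_year = year
                break
    return peak_year, death_year

peak, died = health_indicators({2010: 2, 2011: 9, 2012: 5,
                                2013: 1, 2014: 0, 2015: 0})
print(peak, died)  # 2011 2014
```

Even at this order-of-magnitude level, the pair (peak use, death year) would let us plot tool lifespans across the dataset.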
increasing reuse would naturally improve sustainability (through network effects and shared maintenance).
This is key
Similarly, we have scant information on how often historical researchers reuse software developed by others. If reuse is low, perhaps because tools are hard to find or hard to learn, that could dampen incentives to sustain them. Some evidence suggests reuse is limited in humanities: Barats et al. (2020) hint that unlike sciences where shared tools (R, Python libraries) are common, humanities projects often start from scratch or custom-build
Reinventing the wheel is a problem; in our dataset we should look for evidence of use, at least at an order-of-magnitude scale.
lost research or reanalysis needed
Avoid at all costs.
Anyway - highlight this lack in the 'future directions' section of the paper, and mention it in the section arguing for low-friction transitions.
We lack understanding of how end-users of historical software (e.g. historians using a text mining tool) deal with software obsolescence and what they need for continuity. Do they find workarounds? Do they abandon methods when software breaks?
This sort of research would be useful, until then, focusing on reducing the friction of change from one tool to the next is the best we can do.
There is a need for lightweight assessment tools that account for the realities of these projects. For example, a “Sustainability Scorecard for Digital History Projects” could focus on a few key predictors (open source, multiple collaborators, archived in repository, etc.) and be easier to use than a full maturity model.
Perhaps we could suggest something here based on what our dataset of software looks like.
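A scorecard like this could be a trivially simple function over binary predictors. A sketch — the predictor list is illustrative, drawn from the examples in the excerpt plus our own emphases, not a validated instrument:

```python
# Hypothetical "Sustainability Scorecard for Digital History Projects":
# one point per predictor present. Predictors and equal weighting are
# illustrative assumptions, not from the literature.

PREDICTORS = [
    "open_source",
    "multiple_collaborators",
    "archived_in_repository",
    "standard_data_formats",
    "documentation_present",
]

def scorecard(project):
    """project: dict of predictor -> bool. Returns (score, max_score)."""
    score = sum(1 for p in PREDICTORS if project.get(p, False))
    return score, len(PREDICTORS)

demo = {"open_source": True, "archived_in_repository": True}
print(scorecard(demo))  # (2, 5)
```

The point of keeping it this crude is the contrast with a full maturity model: a handful of checkable yes/no predictors that a small project team could self-assess in minutes.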
data preservation studies
Perhaps use the 'fading away' article as a model.
empirical data on long-term outcomes
Yay, this is our paper!
providing migration tools or services to move users to a new product
Maybe some truncated version of this, where there are built-in archiving / export tools.
if a tool gains a sufficient user base (even if small but dedicated), it can leverage community contributions for maintenance. However, many humanities tools never reach that critical mass
Yes, actual use is key
paying for a support contract means someone is on the hook to fix issues
Yeah, this is what university IT departments want...
Backward compatibility is another hallmark of commercial practice: e.g., Adobe Photoshop today can still open files created 20+ years ago, thanks to consistent format support. In research software, backward compatibility often translates to data portability: making sure that data formats remain readable even if the software isn’t the same. The Endings Principles stress this by requiring data in standard formats (dh-tech.github.io) – if a project’s custom software dies, another tool can potentially read the data.
OK, this was going to be the main point of the paper; need to talk to Peter Sefton about how to go beyond the results of the Endings Project.
Require standard data formats and ensure data portability: if a project's custom software dies, another tool can potentially read the data.
Perhaps we can extend this by focusing on low-friction transitions to other tools, such as the use of bog-standard formats and data bundled with metadata (like RO-Crate).
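A minimal sketch of the kind of bundle this implies: data in a bog-standard format plus an RO-Crate-style metadata file. The file name and top-level entities follow the RO-Crate 1.1 convention; the dataset name and file are placeholders:

```python
import json

# Sketch: bundle exported data with RO-Crate-style metadata so another
# tool can interpret it after the original software dies. Structure
# follows RO-Crate 1.1; "Example export" / "records.csv" are placeholders.

crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {   # the metadata file describes the crate root
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # the crate root: the exported dataset as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": "Example export",          # placeholder
            "hasPart": [{"@id": "records.csv"}],
        },
        {   # the actual data, in a bog-standard format
            "@id": "records.csv",
            "@type": "File",
            "encodingFormat": "text/csv",
        },
    ],
}

with open("ro-crate-metadata.json", "w") as f:
    json.dump(crate, f, indent=2)
```

Because the payload is plain CSV and the metadata is plain JSON-LD, nothing here depends on the originating tool surviving — which is exactly the low-friction transition argument.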
build throwaway prototypes, then rebuild for longevity
This seems like a good idea given practical constraints
digital humanities labs
Perhaps getting better...
small-scale artisanal development
See also 'crapple' license...generally poor RSE practice in DH
single-team developed tool
Small scale, poorly resourced, no standards...
QGIS in archaeology
Voyant, QGIS - large open-source tools the only ones that are really succeeding in archaeology / DH - Pareto distribution, I'm sure...
Best practices from software engineering
Good RSE practices more common in STEM
infrastructure and community level
In big data / big science fields, sustainability is approached at the infrastructure and community level.
project-centric
Small data / small science problems, scale of DH software development rarely exceeds the individual project.
limitations of empirical work
More longitudinal studies needed. Only strong hints and isolated metrics plus analogies to data in the literature at present. No comprehensive empirical model for software longevity.
Methodological Reflections
Methods include surveys, repository mining, case studies and post-mortems, studies of dependency networks. Unsurprisingly, software with more users and uses survives longer (strengthening argument for FAIR software, see above). Broad user base, as with Voyant, is crucial.
integrated archival practice
Static site, source code with DOI (Zenodo + GitHub), descriptive metadata. Again, however, most of the examples are from collections, not tools.
Repositories and Archival Sources
Another GitHub study: Duckles et al. (2020). Poorly documented. Allen et al. (2019) looks at Zenodo and shows an increase in its use after 2016. The Internet Archive can be used to retrospectively study project websites.
Qualitative Studies and Project Post-mortems
Project Bamboo taken as a case study in the unsustainability of DH.
Quantitative Studies on Software Lifespan
Relatively few: Vines et al. (2014) looks at data (half-life of 6-7 years). Nielsen et al. (2017) looked at biology papers and found that many could not be obtained a few years later. The Endings Project has some data. Katz and Niemeyer (2019) look at GitHub repos and found that most have a short, bursty commit history (see also Howison and Herbsleb). Few are active beyond five years or attract multiple external contributors. Overall, short lifespans.
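A Vines-style half-life could be computed from our dataset as the median tool lifespan (last active year minus release year). A sketch; the lifespans below are hypothetical:

```python
from statistics import median

# Sketch: "half-life" for a cohort of tools, taken as the median lifespan,
# i.e. the number of years by which half the cohort has become inactive.
# The example lifespans are hypothetical.

def half_life(lifespans_in_years):
    """lifespans_in_years: iterable of (last_active - released) values."""
    return median(lifespans_in_years)

print(half_life([2, 3, 5, 7, 9, 14]))  # 6.0
```

Combined with the death-year indicator above, this would let us state our own half-life figure alongside Vines et al.'s 6-7 years for data.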
Bamboo
Failed due to lack of governance, iterative development, and community buy-in - the last is probably the most common problem...
centralized institutional support
Importance of shared infrastructure
King’s Digital Lab (KDL)
KDL: tiered archiving framework; classifies each project based on importance and feasibility; some are kept running, others are archived.
Computational History project Stadt.Geschichte.Basel
Applied the Endings Principles: all research data in standard formats and website archived to static HTML.
ARIADNE
Need to get DR to summarise the ARIADNE project
Methodologies and Tools for Sustainability
Methodologies and tools:
* Endings Principles for Digital Longevity: data, documentation, processing, publication, and release management.
* 'Data' under the Endings Principles is the closest to what we are arguing, recommending that all project data should be stored in open, non-proprietary formats, so no closed or obsolescent formats (follow up: it isn't clear exactly what 'data' they are talking about - I assume it's data produced by the software, as we are discussing, but need to confirm).
* Producing a static website end product is not really relevant; shows how much of the literature is about collections rather than tools.
* Goddard (2023): dark archiving, web harvesting, emulation, continuous migration.
National and International Initiatives
National and international initiatives - lots of good advice here:
* Early planning for preservation
* Choosing appropriate OS licenses
* Engaging users to encourage co-development
* Better software practices (e.g., FAIR4RS)
* Software archiving
* Support for reproducibility (what does this mean?)
* Capacity and expertise building
* Larger infrastructure organisations (data centers, libraries, eresearch institutes, etc.) should be more active rather than individual project teams (IMPORTANT: comes from NL report, essentially trying to apply big data / big science solutions to a small-science domain).
* NL report explains how FAIR makes software more sustainable: easier to find and reuse, less likely to be lost or reinvented - I would add: more likely to build a larger user base and get more people involved to provide or find resourcing.
Existing Frameworks and Methodologies
Potential additional research queries: (1) national (and other) reports on software sustainability like the one from NL; (2) relationship of FAIR for RS and sustainability; (3) projects on deliberate end-of-life planning, like the Endings Project; (4) frameworks for assessing sustainability, like RSMM; (5) methods and tools for sustainability, like the Endings Principles.
Netherlands’ national report on research software sustainability
Note the existence of national reports on research software sustainability. Are there others?
adopted by a wider community,
'Intended for adoption by a wider community' might be part of our definition of a software tool, to eliminate one-offs by researchers. The criteria we are using to select software (e.g., a publication about it or citations of it) seem to select for this.
Endings Project
Endings project is crucial, as it's the only one I've found so far that actually promotes end-of-life planning.
Research Software Sustainability Maturity Model (RSMM)
RSMM is important, but too new to assess whether it's had any impact.
FAIR Principles
FAIR has, I think, limited applicability to longevity / sustainability. It helps make software more discoverable and reusable, which in turn might increase uptake and make it more likely that the original creators or someone else will do the work to maintain the software. Otherwise, the key here is improved interoperability, which allows the software to be substituted with something else with as little friction as possible.
shares common fundamental issues with other domains (e.g. technology obsolescence, need for maintenance effort) but often without the same level of structural support; addressing these issues in a humanities context requires tailoring strategies to smaller projects and advocacy for institutional change.
Small data / small science problem again
abandonment
Keyword: abandonment
scientific software may be sustained as part of ongoing experimental operations or through agencies like NSF and DOE that mandate data/software management plans, whereas humanities software is frequently tied to one-time grants with less stringent post-project requirements
Cross-domain differences in funding models
scale of projects often differs as well: fields like astronomy or genomics may create large, collaboratively developed software (with dozens of contributors and multi-year roadmaps), whereas a digital history project might be a small team effort
Again, big vs. small; particularly challenging for small disciplines since the work necessary for maintenance is concentrated, there are fewer standards / less standardisation, less shared infrastructure (more bespoke work / reinventing the wheel), and there is less funding.
making software and data FAIR (Findable, Accessible, Interoperable, Reusable) has become a “shared ambition” backed by concrete action (digitalhumanities.org). By contrast, in humanities, such principles until recently remained more of a theoretical discussion than common practice
'big science' vs 'small science' / big data vs small data
changes in culture
culture change necessary around organisational and social context of research tools
knowledge custodian
Brittle nature of the staffing around digital tools
Once a grant ends, there may be no dedicated budget to update the software or migrate it to newer platforms. This cyclic funding model – “forever or five years” as one commentary wryly put it (evoking Rothenberg’s famous quote that digital content lasts forever or five years)
Grant funding is brief, but upkeep costs continue
persistence is a function of organizations, not a function of technology”
Keyword: persistence
scholars hope their libraries will “adopt [the] project wholesale” and keep all components running indefinitely, which is typically not feasible
tools need continual upkeep
Institutional challenges
Institutional / socio-technical challenges. I'd probably separate 'technological challenges' and 'socio-technical challenges' in the lit review.
inevitable obsolescence of software dependencies and environments
This is key: 'inevitable obsolescence of software dependencies and environments'
historical research outputs (digitized archives, databases, analytical tools) often need to endure far longer than the rapidly shifting technologies that support them
I wouldn't quite word it this way, but the idea that research software tools need to endure longer than the technologies they are built from is an important one, and summarises why tools need continual investment and resourcing.
tension between long-term preservation needs and the short life cycles of software, data formats, and platforms
Keywords: preservation, life-cycle
sustainability and long-term longevity
Keywords for conventional lit search: sustainability, longevity