12 Matching Annotations
  1. Jan 2021
    1. I started using Cronicle a few weeks ago and really like it. Runs on a server... https://github.com/jhuckaby/Cronicle

      This also ticks a lot of my desired features.

      Really easy to set up if you already have node ready to go.

      UI is very slick and feels right to me out of the box.

      Multi-server support.

      Jobs can be assigned to categories. A given category can have a maximum number of concurrent processes running at the same time (so run 1 backup task at a time, even though 5 tasks are scheduled within the same time period). Individual tasks can also be set to be singleton or to have a configurable max concurrency.
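
      A rough sketch of how I picture the per-category cap working. This is not Cronicle's implementation or config format; CATEGORY_LIMITS, run_in_category, and demo are names I made up for illustration, and a semaphore per category is just one way to get the behaviour.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical category limits; not Cronicle's real config format.
CATEGORY_LIMITS = {"backup": 1, "reports": 2}

# One semaphore per category caps how many jobs of that category run at once.
_semaphores = {
    name: threading.BoundedSemaphore(limit)
    for name, limit in CATEGORY_LIMITS.items()
}


def run_in_category(category, job, *args):
    """Run job(*args), blocking until the category has a free slot."""
    with _semaphores[category]:
        return job(*args)


def demo(num_jobs=5):
    """Start num_jobs backup jobs together and report the peak concurrency."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def fake_backup(i):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for real work
        with lock:
            state["running"] -= 1
        return i

    with ThreadPoolExecutor(max_workers=num_jobs) as pool:
        results = list(
            pool.map(lambda i: run_in_category("backup", fake_backup, i),
                     range(num_jobs))
        )
    return results, state["peak"]
```

      Five "backup" jobs submitted together still run one at a time, because the "backup" category only has one slot.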

      Supports configurable retry (number of attempts, delay between).
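
      For my own notes, the retry behaviour boils down to something like this. It is a minimal sketch with a fixed delay between attempts; run_with_retries and its parameters are my own names, not anything from Cronicle.

```python
import time


def run_with_retries(job, max_attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call job() until it succeeds or max_attempts is used up.

    Returns (result, attempts_used). Re-raises the last exception if every
    attempt fails. The sleep argument is injectable so tests need not wait.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return job(), attempt
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(delay_seconds)  # fixed delay between attempts
```

      A job that fails twice and then succeeds would return its result along with an attempt count of 3.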

      Supports optional catch-up runs for missed runs, as well as queued runs.

      Supports killing and erroring out if timeouts or resource limits are hit.
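
      The timeout half of this is easy to approximate with the standard library. A minimal sketch, assuming the job is a plain subprocess; run_with_timeout is a made-up helper, and it does not cover the resource-limit (CPU/memory) side at all.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; kill it and report a timeout if it runs too long.

    Returns (status, stdout), where status is "ok", "failed", or "timeout".
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout


if __name__ == "__main__":
    # A job that sleeps longer than its allowed runtime gets killed.
    print(run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], 1))
```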

      Time from download to first job setup... 2 minutes? Very intuitive UI.

      Has a management API; not clear whether it has a good existing CLI.

      Also supports setting up users to be able to run pre-defined scripts and see output.

      Need to figure out how to back-up and restore jobs.

    2. RUNDECK

      Very quick impression is this ticks a lot of my desired features.

      I'm not wild about the community edition's default dashboard - I'd rather have a more high-level view of everything configured and its statuses.

      UI is clunky when compared to Cronicle. Lots of steps to get from setting it up to actually running something. No quick click from a run task to its log output. No good resources/stats view that I found.

      Like the fact it keeps track of logs, runtime (and can alert if runtime deviates from normal), gives you an estimated time to complete, lets you run on a schedule and/or manually.
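
      I don't know how Rundeck actually decides a runtime is abnormal or how it computes its estimate. One simple way to get the same effect, as a sketch: compare the latest run against the mean and standard deviation of past runs. The function names and the threshold default are my own assumptions.

```python
from statistics import mean, stdev


def runtime_is_unusual(history_seconds, latest_seconds, threshold=3.0, min_runs=5):
    """Return True if the latest runtime is far from the historical mean.

    "Far" means more than `threshold` sample standard deviations away.
    With fewer than min_runs past runs there is not enough data to judge.
    """
    if len(history_seconds) < min_runs:
        return False
    avg = mean(history_seconds)
    spread = stdev(history_seconds)
    if spread == 0:
        # Every past run took exactly the same time.
        return latest_seconds != avg
    return abs(latest_seconds - avg) / spread > threshold


def estimated_remaining(history_seconds, elapsed_seconds):
    """Estimate seconds left as (mean past runtime - elapsed), floored at 0."""
    if not history_seconds:
        return None
    return max(0.0, mean(history_seconds) - elapsed_seconds)
```

      With past runs of roughly a minute each, a two-minute run gets flagged while a 61-second run does not.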

      Like the fact it supports farming tasks off through SSH or other, or running them locally. Can auto-discover nodes using a script you provide (e.g. query AWS nodes) or using static config.

      Really interested in the multi-user capabilities. This may solve a problem I didn't really know I had at work (giving a semi-technical person access to kick off jobs or monitor them before asking me).

    3. huginn: great example of a user-friendly tool

      User-friendly in the sense it gives you a UI, but lots of clicks and UI to get something done.

      Also more of a "do random things not worth programming" tool, and less of a "run all of my scripts and processes" tool.

      Doesn't seem to fit well with the rest of the page.

      Reminds me of programming in Tasker on Android. My non-programmer friend automates (almost) everything in his life with it, but I try to use it and it frustrates the hell out of me and I just want a scripting language.

    4. Sadly, most of them are pretty heavy: often distributed, aiming at orchestrating clusters and grids, which is quite an overkill for my humble personal needs.

      That's been part of the challenge. A lot of my desired functionality seems to lead to the same tools as people managing large clusters for company infrastructure.

      So it stops being easy to configure/manage because flexibility usually leads to massive cognitive overhead.

    5. Running all that manually (more than 100 scripts across all devices) is an awful job for a human. I want to set them up once and more or less forget about it, only checking now and then.

      My ideals for all of my regular processes and servers:

      • Centralized configuration and control - I want to go into a folder and configure everything I'm running everywhere.
      • Configuration file has the steps needed to set up from scratch - so I can just back up the configuration and data folders and not worry about backing up the programs.
      • Control multiple machines from the central location. Dictate where tasks can run.
      • [nice to have] Allow certain tasks to run externally, e.g. in AWS ECS or Lambda or similar
      • Command-line access for management (web is great for monitoring)
      • Flexible scheduling (from strict every minute to ~daily)
      • Support for daemons, pseudo-daemons (just run repeatedly with small delays), and periodic tasks.
      • Smart alerts - some processes can fail occasionally but need to run at least once per day; others should never fail. A repeating inaccurate alert is usually just as bad as no alert at all.
      • Error code respect (configurable)
      • Logs - store the program output, organize it, keep it probably in a date-based structure
      • Health checks - if it's a web server, is it still responding to requests? Has it logged something recently? Touched a database file? If not, it's probably dead.
      • Alerts support in Telegram and email
      • Monitor details about the run - how long did it take? How much CPU did it use? Has it gotten slower over time?
      • Dashboard - top-level stats, browse detailed run stats and logs
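
      To make the "smart alerts" item concrete, here is a minimal sketch of the two policies I have in mind: "should never fail" versus "may fail, as long as it succeeded recently". AlertPolicy and should_alert are hypothetical names, and this leaves out suppressing repeats of an alert that has already been sent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AlertPolicy:
    # None means "should never fail": every failed run alerts.
    # Otherwise failures are tolerated while the last success is recent enough.
    max_age_without_success: Optional[timedelta] = None


def should_alert(policy, last_run_failed, last_success, now):
    """Decide whether a task's current state deserves an alert."""
    if policy.max_age_without_success is None:
        return last_run_failed
    if last_success is None:
        return True  # assumption: a task that has never succeeded is stale
    return now - last_success > policy.max_age_without_success


if __name__ == "__main__":
    now = datetime(2021, 1, 15, 12, 0)
    daily = AlertPolicy(max_age_without_success=timedelta(days=1))
    # Failed just now, but succeeded 3 hours ago: stay quiet.
    print(should_alert(daily, True, now - timedelta(hours=3), now))
```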

      So much of the configuration/control stuff screams containers, so more and more I'm using Docker for my scripts, even simpler ones.

      I'm pretty sure a lot of this is accomplished by existing Docker orchestration tools. Been delaying that rabbit hole for a long time.

      I think the key thing that makes this not just a "cron" problem for me is that I want something that monitors and manages both itself and the tasks I want to run, including creating/setting them up if they don't already exist. Ideally I also want to focus my mental energy on a single controller that handles my "keep this running" things all together, be they servers or infrequent tasks.

      Doesn't have to be a single project. Might be multiple pieces glued together somehow.

    1. on Google groups, they are meaningful and address specific discussions and messages:

      #! for AJAX crawlable pages. Of course this is only a signal for what the hash might mean, but I remember when Google was trying to make this a "thing". https://developers.google.com/search/docs/ajax-crawling/docs/getting-started?csw=1

    1. ¶Telegram

      Can a full backup of my Telegram be used such that a message I forward to Saved Messages can link back to the context in which the message was forwarded?

      Do I have to forward the context of the video link I'm saving for later, or is there enough metadata to back-link to where it came from?

    2. was planning to correlate them with monzo/HSBC transactions, but haven't got to it yet

      Plaintext accounting, specifically beancount and beancount-import, is my toolset of choice - it (beancount-import) has the power to figure out how to match up Amazon transactions to bank/credit card charges, even if an order is billed as multiple separate charges. Looks like you're using ledger, though.

      Honorable mention to fava beancount web UI if you're looking for more reason to move over from ledger.

      I use Plaid in place of monzo for pulling bank transactions. No good API-driven banks or credit cards here in the US.

    3. mount your Google Drive (e.g. via google-drive-ocamlfuse)

      I'm currently using rclone to fully sync my Google Drive to my NAS, and then ratarmount to make the archives mountable in a very fast (once the initial indexing of the tgz is completed) manner.

      Seriously, check out ratarmount if you haven't. Since the Google Takeout spans multiple 50GB tgz files (I'm at ~14, not including Google Drive in the takeout), ratarmount is brilliant. It merges all of the tgz contents into a single folder structure, so /path/a/1.jpg and /path/a/1.json might be in different tgz files but are mounted into the same folder.

      Currently just for my unorganized "only in Google Photos" photos - since I just want to make sure I have a backup as I establish my desired archiving routine. I point my backup program (duplicacy) to the mounted tgz folder so the photos and JSON get entered into the backups. Just trying to cover my "Google locks me out and my ZFS pool dies in the same time period" bases.

    1. I want to argue very strongly against forcing the data in the database, unless it's really inevitable.

      At work I've been taking this approach in Amazon Web Services a lot as well, using Athena - Amazon's wrapper around Presto ("Distributed SQL Query Engine for Big Data").

      Basically, as long as your data is coming from the third party or your own systems as CSV or JSON, it's really easy to create a SQL wrapper around the data, and throw the power of Amazon's servers at it.

      Athena isn't a real-time database, but for so many of our use-cases (even with terabytes of data), it's fast enough, cheap enough, and so much easier to deal with than managing a "real" database.

      The ideas in this article about using DAL/DAO classes to abstract away third party formats are great. I was already in that mindset in some of my work and personal projects, but this article sparked some new ideas.
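
      A minimal sketch of what I mean by a DAO over raw third-party files, read in place rather than imported into a database. The export format here (a JSON array with "ts", "amount_cents", and "desc" fields) and the class names are invented for illustration; they are not from the article or any real bank export.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterator


@dataclass(frozen=True)
class Transaction:
    when: datetime
    amount: float
    description: str


class BankExportDAO:
    """Reads raw export files as they are; callers never see the raw format."""

    def __init__(self, export_dir):
        self.export_dir = Path(export_dir)

    def transactions(self) -> Iterator[Transaction]:
        # Hypothetical layout: each *.json file holds a list of raw records.
        for path in sorted(self.export_dir.glob("*.json")):
            for raw in json.loads(path.read_text(encoding="utf-8")):
                yield Transaction(
                    when=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
                    amount=raw["amount_cents"] / 100,
                    description=raw.get("desc", ""),
                )
```

      If the third party changes its format, only transactions() has to change; everything downstream keeps working with Transaction objects.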

    1. To get a refresh token, you must include the offline_access scope when you initiate an authentication request through the /authorize endpoint.

      You also must enable Allow Offline Access in the configuration for the API in the Auth0 console.
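
      As a reminder to myself, the request side looks roughly like this. It is a sketch of building the /authorize URL for the authorization code flow with offline_access in the scope; the domain, client ID, redirect URI, and audience values are placeholders, and build_authorize_url is my own helper, not part of any Auth0 SDK.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(domain, client_id, redirect_uri, audience, state,
                        scopes=("openid",)):
    """Build an /authorize URL whose scope always includes offline_access."""
    scope_list = list(scopes)
    if "offline_access" not in scope_list:
        scope_list.append("offline_access")  # needed to get a refresh token
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "audience": audience,  # identifier of the API with offline access allowed
        "scope": " ".join(scope_list),
        "state": state,
    }
    return f"https://{domain}/authorize?{urlencode(params)}"


if __name__ == "__main__":
    # All values below are placeholders.
    url = build_authorize_url(
        "example-tenant.auth0.com", "CLIENT_ID",
        "https://app.example.com/callback", "https://api.example.com", "xyz",
    )
    print(parse_qs(urlparse(url).query)["scope"])
```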

    1. This is until you realize you're probably using at least ten different services, and they all have different purposes, with various kinds of data, endpoints and restrictions.

      10 is so low. Just counting the ones I want to actively back-up, I've made a list that's almost 30 entries long.