  1. Apr 2025
    1. Welcome back. Over the next few lessons and the wider course, we'll be covering storage a lot, and the exam expects you to know the appropriate type of storage to pick for a given situation. So before we move on to the AWS-specific storage lessons, I wanted to quickly do a refresher. So let's get started.

      Let's start by covering some key storage terms. First is direct attached or local attached storage. This is storage, so physical disks, which are connected directly to a device, so a laptop or a server. In the context of EC2, this storage is directly connected to the EC2 hosts and it's called the instance store. Directly attached storage is generally super fast because it's directly attached to the hardware, but it suffers from a number of problems. If the disk fails, the storage can be lost. If the hardware fails, the storage can be lost. If an EC2 instance moves between hosts, the storage can be lost.

      The alternative is network attached storage, which is where volumes are created and attached to a device over the network. In on-premises environments, this uses protocols such as iSCSI or Fibre Channel. In AWS, it uses a product called Elastic Block Store, known as EBS. Network storage is generally highly resilient and is separate from the instance hardware, so the storage can survive issues which impact the EC2 host.

      The next term is ephemeral storage and this is just temporary storage, storage which doesn't exist long-term, storage that you can't rely on to be persistent. And persistent storage is the next point, storage which exists as its own thing. It lives on past the lifetime of the device that it's attached to, in this case, EC2 instances. So an example of ephemeral storage, so temporary storage, is the instance store, so the physical storage that's attached to an EC2 host. This is ephemeral storage. You can't rely on it, it's not persistent. An example of persistent storage in AWS is the network attached storage delivered by EBS.

      Remember that, it's important for the exam. You will get questions testing your knowledge of which types of storage are ephemeral and persistent. Okay, next I want to quickly step through the three main categories of storage available within AWS. The category of storage defines how the storage is presented either to you or to a server and also what it can be used for.

      Now the first type is block storage. With block storage, you create a volume, for example inside EBS. The red object on the right is a volume of block storage, and a volume of block storage has a number of addressable blocks, the cubes with the hash symbol. It could be a small number of blocks or a huge number, depending on the size of the volume, but there's no structure beyond that. Block storage is just a collection of addressable blocks presented either logically as a volume or as a blank physical hard drive.

      Generally when you present a unit of block storage to a server, so a physical disk or a volume, on top of this, the operating system creates a file system. So it takes the raw block storage, it creates a file system on top of this, for example, NTFS or EXT3 or many other different types of file systems and then it mounts that, either as a C drive in Windows operating systems or the root volume in Linux.

      Now block storage comes in the form of spinning hard disks or SSDs, so physical media that's block storage, or it's delivered as a logical volume which is itself backed by different types of physical storage, so hard disks or SSDs. In the physical world, storage area network (SAN) systems provide block storage over the network, and a simple hard disk in a server is an example of physical block storage. The key thing is that block storage has no inbuilt structure, it's just a collection of uniquely addressable blocks. It's up to the operating system to create a file system and then mount that file system so it can be used by the operating system.

      So with block storage in AWS, you can mount a block storage volume, so you can mount an EBS volume and you can also boot off an EBS volume. So most EC2 instances use an EBS volume as their boot volume and that's what stores the operating system, and that's what's used to boot the instance and start up that operating system.

      Now next up, we've got file storage and file storage in the on-premises world is provided by a file server. It's provided as a ready-made file system with a structure that's already there. So you can take a file system, you can browse to it, you can create folders and you can store files on there. You access the files by knowing the folder structure, so traversing that structure, locating the file and requesting that file.

      You cannot boot from file storage because the operating system doesn't have low-level access to the storage. Instead of accessing raw blocks and creating its own file system, with file storage the operating system is given access to a ready-made file system, normally over the network, by another product. So file storage in some cases can be mounted, but it cannot be used for booting. Inside AWS, there are a number of file storage or file-system-style products, and in a lot of cases these can be mounted into the file system of an operating system, but they can't be used to boot.

      Now lastly, we have object storage and this is a very abstract system where you just store objects. There is no structure, it's just a flat collection of objects. And an object can be anything, it can have attached metadata, but to retrieve an object, you generally provide a key and in return for providing the key and requesting to get that object, you're provided with that object's value, which is the data back in return.

      And objects can be anything: they can be binary data, they can be images, they can be movies, they can be cat pictures, like the one in the middle here that we've got of Whiskers. They can be any data, really, stored inside an object. The key thing about object storage, though, is that it's just flat storage. It's flat, it doesn't have a structure. You just have a container, in AWS's case an S3 bucket, and inside that S3 bucket you have objects. The benefit of object storage is that it's super scalable. It can be accessed by thousands or millions of people simultaneously, but it's generally not mountable inside a file system and it's definitely not bootable.

      So it's really important that you understand the differences between these three main types of storage. Generally, in the on-premises world and in AWS, if you want storage to boot from, it will be block storage. If you want high-performance storage inside an operating system, it will also be block storage. If you want to share a file system across multiple different servers or clients, or have it accessed by different services, that will often be file storage. And if you need to read and write object data at scale, say you're making a web-scale application storing the biggest collection of cat pictures in the world, that's ideal for object storage, because it is almost infinitely scalable.
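      As a rough sketch of that decision logic, here's a tiny illustrative helper. The function name and flags are my own invention for this lesson's rule of thumb, not anything from AWS or an official decision tree:

```python
def pick_storage_category(boot=False, shared_filesystem=False, web_scale_objects=False):
    """Illustrative rule of thumb from the lesson, not an official AWS decision tree."""
    if boot:
        return "block"   # boot volumes and high-performance OS storage -> block (e.g. EBS)
    if shared_filesystem:
        return "file"    # shared across multiple servers/clients -> file storage
    if web_scale_objects:
        return "object"  # near-infinitely scalable flat storage -> object (e.g. S3)
    return "block"       # sensible default for general OS-attached storage

print(pick_storage_category(boot=True))              # block
print(pick_storage_category(web_scale_objects=True)) # object
```

      The point isn't the code itself, it's that the first question is always "what does the consumer of the storage need to do with it?"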

      Now let's talk about storage performance. There are three terms which you'll see whenever anyone refers to storage performance: the IO or block size; the input/output operations per second, pronounced IOPS; and the throughput, so the amount of data that can be transferred in a given second, generally expressed in megabytes per second.

      Now these things cannot exist in isolation. You can think of IOPS as the speed at which the engine of a race car runs, the revolutions per second. You can think of the IO or block size as the size of the wheels of the race car. And then you can think of the throughput as the end speed of the race car. So the engine of a race car spins at a certain number of revolutions, there may be a transmission that affects that slightly, but that power is delivered to the wheels, and based on their size, that causes you to travel at a certain speed.

      In theory, in isolation, if you increase the size of the wheels or increase the revolutions of the engine, you go faster. For storage, as in the analogy I just provided, they're all related to each other. The possible throughput a storage system can achieve is the IO or block size multiplied by the IOPS.

      As we talk about these three performance aspects, keep in mind that a physical storage device, a hard disk or an SSD, isn't the only thing involved in that chain of storage. When you're reading or writing data, it starts with the application, then the operating system, then the storage subsystem, then the transport mechanism to get the data to the disk, the network or the local storage bus, such as SATA, and then the storage interface on the drive, the drive itself and the technology that the drive uses. These are all components of that chain. Any point in that chain can be a limiting factor, and it's the lowest common denominator of the entire chain that controls the final performance.

      Now IO or block size is the size of the blocks of data that you're writing to disk. It's expressed in kilobytes or megabytes and it can range from pretty small sizes to pretty large sizes. An application can choose to write or read data of any size and it will either take the block size as a minimum or that data can be split up over multiple blocks as it's written to disk. If your storage block size is 16 kilobytes and you write 64 kilobytes of data, it will use four blocks.
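      The 64 kilobyte example can be checked in a couple of lines: the number of blocks a write consumes is just the data size divided by the block size, rounded up. A quick sketch:

```python
import math

def blocks_used(data_kb, block_kb):
    # A write smaller than one block still consumes a whole block;
    # larger writes are split across multiple blocks.
    return math.ceil(data_kb / block_kb)

print(blocks_used(64, 16))  # 4 blocks, as in the example above
print(blocks_used(4, 16))   # 1 block - the block size acts as a minimum
```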

      Now IOPS measures the number of IO operations the storage system can support in a second, so how many reads or writes a disk or a storage system can accommodate in a second. Using the car analogy, it's the revolutions per second that the engine can generate given its default wheel size. Certain media types are better at delivering high IOPS than others, and certain media types are better at delivering high throughput than others. If you use network storage rather than local storage, the network can also impact how many IOPS can be delivered. Higher latency between a device that uses network storage and the storage itself can massively impact how many operations you can do in a given second.

      Now throughput is the rate at which data can be transferred to or from a particular piece of storage, either a physical disk or a volume. Generally this is expressed in megabytes per second, and it's related to the IO block size and the IOPS, but it can have a limit of its own. If you have a storage system which stores data using 16 kilobyte block sizes and it can deliver 100 IOPS at that block size, then it can deliver a throughput of 1.6 megabytes per second. If your application only stores data in 4 kilobyte chunks and the 100 IOPS is a maximum, then you can only achieve 400 kilobytes per second of throughput.
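      Those two worked examples follow directly from throughput = block size x IOPS. A quick sketch to confirm the numbers:

```python
def throughput_kb_per_s(block_size_kb, iops):
    # Possible throughput is the IO (block) size multiplied by the IOPS,
    # ignoring any separate throughput cap the system may impose.
    return block_size_kb * iops

print(throughput_kb_per_s(16, 100))  # 1600 KB/s, i.e. 1.6 MB/s
print(throughput_kb_per_s(4, 100))   # 400 KB/s
```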

      Achieving the maximum throughput relies on using the right block size for that storage and then maximizing the number of IOPS that you pump into the storage system. So all of these things are related. If you want to maximize your throughput, you need to use the right block size and then maximize the IOPS, and if any of these three are limited, it can impact the other two. With the example on screen, if you were to change the 16 kilobyte block size to one megabyte, it might seem logical that you could now achieve 100 megabytes per second: one megabyte times 100 IOPS. But that's not always how it works. A system might have a throughput cap, for example, or as you increase the block size, the IOPS that you can achieve might decrease.

      As we talk about the different AWS types of storage, you'll become much more familiar with all of these different values and how they relate to each other. So you'll start to understand the maximum IOPS and the maximum throughput levels that different types of storage in AWS can deliver. And you might face exam questions where you need to answer what type of storage you will pick for a given level of performance demands. So it's really important as we go through the next few lessons that you pay attention to these key levels that I'll highlight.

      It might be, for example, that a certain type of storage can only achieve 1,000 IOPS or 64,000 IOPS. Or it might be that certain types of storage cap at certain levels of throughput. And you need to know those values for the exam so that you know when to use a certain type of storage.

      Now, this is a lot of theory and I'm talking in the abstract and I'm mindful that I don't want to make this boring and it probably won't sink in and you won't start to understand it until we focus on some AWS specifics. So I am going to end this lesson here. I wanted to give you the foundational understanding, but over the next few lessons, you'll start to be exposed to the different types of storage available in AWS and you will start to paint a picture of when to pick particular types of storage versus others.

      So with that being said, that's everything I wanted to cover. I know this has been abstract, but it will be useful if you do the rest of the lessons in this section. I promise you this is going to be really valuable for the exam. So thanks for watching. Go ahead and complete the video. When you're ready, you can join me in the next.

    1. Welcome back—this is part two of this lesson, and we're going to continue immediately from the end of part one, so let's get started.

      Now, this is an overview of all of the different categories of instances, and then for each category, the most popular or current generation types that are available; I created this with the hope that it will help you retain this information.

      This is the type of thing that I would generally print out or keep an electronic copy of and refer to constantly as we go through the course—by doing so, whenever we talk about particular size and type and generation of instance, if you refer to the details in the notes column, you'll be able to start making a mental association between the type and then what additional features you get.

      So, for example, if we look at the general purpose category, we've got three main entries in that category: we've got the A1 and M6G types, and these are a specific type of instance based on ARM processors; the A1 uses the AWS-designed Graviton ARM processor, and the M6G uses the second-generation Graviton2 ARM-based processor.

      And using ARM-based processors, as long as you've got operating systems and applications that can run under the architecture, they can be very efficient—so you can use smaller instances with lower cost and achieve really great levels of performance.

      The T3 and T3A instance types are burstable instances, so the assumption with those types of instances is that your normal CPU load will be fairly low, and you have an allocation of burst credits that allows you to burst up to higher levels occasionally but then return to that normally low CPU level.

      So this type of instance—T3 and T3A—are really good for machines which have low normal loads with occasional bursts, and they're a lot cheaper than the other types of general purpose instances.

      Then we've got M5, M5A, and M5N—so M5 is your starting point, M5A uses the AMD architecture whereas normal M5s just use Intel, and these are your steady-state general instances.

      So if you don't have a burst requirement and you're running a certain type of application server which requires consistent steady-state CPU, then you might use the M5 type—maybe a heavily used Exchange email server that runs normally at 60% CPU utilization might be a good candidate for M5.

      But if you've got a domain controller or an email relay server that normally runs maybe at 2%, 3% with occasional bursts up to 20%, 30%, or 40%, then you might want to run a T-type instance.

      Now, not to go through all of these in detail: we've got the compute optimized category with the C5 and C5N, and these are aimed at media encoding, scientific modeling, gaming servers, and general machine learning.

      For memory optimized, we start off with R5 and R5A; if you want to run really large in-memory applications, you've got the X1 and the X1E; if you want the highest memory of any AWS instances, you've got the high memory series; and you've got the Z1D, which comes with large memory and NVMe storage.

      Then, Accelerated Computing—these are the ones that come with these additional capabilities, so the P3 type and G4 type come with different types of GPUs: the P type is great for parallel processing and machine learning, while the G type is kind of okay for machine learning and much better for graphics-intensive requirements.

      You've got the F1 type, which comes with field programmable gate arrays, which is great for genomics, financial analysis, and big data—anything where you want to program the hardware to do specific tasks.

      You've got the Inf1 type, which is relatively new and custom-designed for machine learning, so recommendations, forecasting, analysis, voice conversation; for anything machine learning-related, look at using that type.

      And then, storage-optimized instances—these come with high-speed local storage, and depending on the type you pick, you can get high throughput or maximum I/O or somewhere in between.

      So, keep this somewhere safe, print it out, keep it electronically, and as we go through the course and use the different types of instances, refer to this and start making the mental association between what a category is, what instance types are in that category, and then what benefits they provide.

      Now again, don't worry about memorizing all of this for the exam—you don't need it—I'll draw out anything specific that you need as we go through the course, but just try to get a feel for which letters are in which categories.

      If that's the minimum that you can do—if I can give you a letter like the T type, or the C type, or the R type—and you can try and understand the mental association with which category that goes into, that will be a great step.

      And there are ways we can do this—we can make these associations—so C stands for compute, R stands for RAM (which is a way of describing memory), we've got I which stands for I/O, D which stands for dense storage, G which stands for GPU, P which stands for parallel processing; there's lots of different mind tricks and mental associations that we can do, and as we go through the course, I'll try and help you with that.
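      Those mnemonics can be written down as a simple lookup table. To be clear, this is just the lesson's memory aid expressed in code, not an exhaustive or official AWS mapping:

```python
# Mnemonic mapping from the lesson: first letter of the family -> category.
# Illustrative only; AWS adds new families regularly.
FAMILY_CATEGORY = {
    "a": "general purpose (ARM/Graviton)",
    "t": "general purpose (burstable)",
    "m": "general purpose (steady state)",
    "c": "compute optimized",        # C for Compute
    "r": "memory optimized",         # R for RAM
    "x": "memory optimized",
    "z": "memory optimized",
    "p": "accelerated computing",    # P for Parallel processing (GPU)
    "g": "accelerated computing",    # G for GPU
    "f": "accelerated computing",    # FPGA
    "i": "storage optimized",        # I for I/O
    "d": "storage optimized",        # D for Dense storage
    "h": "storage optimized",
}

def category_for(instance_type):
    """e.g. 'c5.large' -> 'compute optimized' (illustrative helper, not an AWS API)."""
    return FAMILY_CATEGORY.get(instance_type[0].lower(), "unknown")

print(category_for("c5.large"))   # compute optimized
print(category_for("r5a.xlarge")) # memory optimized
```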

      But as a minimum, either print this out or store it somewhere safe and refer to it as we go through the course.

      The key thing to understand, though, is that picking an instance type is specific to a particular type of computing scenario: so if you've got an application that requires maximum CPU, look at compute optimized; if you need memory, look at memory optimized; if you've got a specific type of acceleration, look at accelerated computing; and otherwise, start off in the general purpose instance types and then move out from there as you identify a particular requirement.

      Now before we finish up, I did want to demonstrate two really useful sites that I refer to constantly—I'll include links to both of these in the lesson text.

      The first one is the Amazon documentation site for Amazon EC2 instance types—this gives you a full view of all the different categories of EC2 instances.

      You can look in a category, a particular family and generation of instance—so T3—and then in there you can see the use cases that this is suited to, any particular features, and then a list of each instance size and exactly what allocation of resources that you get and then any particular notes that you need to be aware of.

      So this is definitely something you should refer to constantly, especially if you're selecting instances to use for production usage.

      This other website is something similar—it’s EC2instances.info—and it provides a really great sortable list which can be filtered and adjusted with different attributes and columns, which give you an overview of exactly what each instance provides.

      So you can either search for a particular type of instance—maybe a T3—and then see all the different sizes and capabilities of T3; as well as that, you can see the different costings for those instance types—so Linux on-demand, Linux reserved, Windows on-demand, Windows reserved—and we’ll talk about what this reserved column is later in the course.

      You can also click on columns and show different data for these different instance types, so if I scroll down, you can see which offer EBS optimization, you can see which operating systems these different instances are compatible with, and you've got a lot of options to manipulate this data.

      I find this to be one of the most useful third-party sites—I always refer back to this when I’m doing any consultancy—so this is a really great site.

      And again, it will go into the lesson text, so definitely as you’re going through the course, experiment and have a play around with this data, and just start to get familiar with the different capabilities of the different types of EC2 instances.

      With that being said, that’s everything I wanted to cover in this lesson—you’ve done really well, and there’s been a lot of theory, but it will come in handy in the exam and real-world usage.

      So go ahead, complete this video, and when you’re ready, you can join me in the next.

    1. Welcome back. In this lesson, now that we've covered virtualization at a high level, I want to focus on the architecture of the EC2 product in more detail. EC2 is one of the services you'll use most often in AWS and one which features on a lot of exam questions, so let's get started.

      First, let's cover some key, high-level architectural points about EC2. EC2 instances are virtual machines, so this means an operating system plus an allocation of resources such as virtual CPU, memory, potentially some local storage, maybe some network storage, and access to other hardware such as networking and graphics processing units. EC2 instances run on EC2 hosts, and these are physical server hardware which AWS manages. These hosts are either shared hosts or dedicated hosts.

      Shared hosts are hosts which are shared across different AWS customers, so you don't get any ownership of the hardware, and you pay for the individual instances based on how long you run them for and what resources they have allocated. It's important to understand, though, that every customer using a shared host is isolated from the others: there's no visibility of it being shared and there's no interaction between different customers, even if you're using the same shared host. Shared hosts are the default.

      With dedicated hosts, you're paying for the entire host, not the instances which run on it. It's yours, it's dedicated to your account, and you don't have to share it with any other customers. So if you pay for a dedicated host, you pay for that entire host, you don't pay for any instances running on it, and you don't share it with other AWS customers.

      EC2 is an availability zone resilient service. The reason for this is that hosts themselves run inside a single availability zone, so if that availability zone fails, the hosts inside that availability zone could fail, and any instances running on any hosts that fail will themselves fail. So as a solutions architect, you have to assume if an AZ fails, then at least some and probably all of the instances that are running inside that availability zone will also fail or be heavily impacted.

      Now let's look at how this looks visually. This is a simplification of the US East 1 region. I've only got two AZs represented, AZ A and AZ B, and in AZ A I've represented two subnets, subnet A and subnet B. Now inside each of these availability zones is an EC2 host. These EC2 hosts run within a single AZ; I'm going to keep repeating that because it's critical for the exam and for how you think about EC2 in the exam.

      Keep thinking about it being an AZ resilient service, if you see EC2 mentioned in an exam, see if you can locate the availability zone details because that might factor into the correct answer. Now EC2 hosts have some local hardware, logically CPU and memory, which you should be aware of, but also they have some local storage called the instance store. The instance store is temporary, if an instance is running on a particular host, depending on the type of the instance, it might be able to utilize this instance store, but if the instance moves off this host to another one, then that storage is lost.

      And they also have two types of networking, storage networking and data networking. When instances are provisioned into a specific subnet within a VPC, what's actually happening is that a primary elastic network interface is provisioned in a subnet, which maps to the physical hardware on the EC2 host. Remember, subnets are also in one specific availability zone. Instances can have multiple network interfaces, even in different subnets, as long as they're in the same availability zone. Everything about EC2 is focused around this architecture, the fact that it runs in one specific availability zone.

      Now EC2 can make use of remote storage: an EC2 host can connect to the Elastic Block Store, which is known as EBS. The Elastic Block Store service also runs inside a specific availability zone, so the service running inside availability zone A is different from the one running inside availability zone B, and you can't access them cross-zone. EBS lets you allocate volumes, and volumes are portions of persistent storage, and these can be allocated to instances in the same availability zone. So again, it's another area where the availability zone matters.

      What I'm trying to do by repeating availability zone over and over again is to paint a picture of a service which is very reliant on the availability zone that it's running in. The host is in an availability zone, the network is per availability zone, the persistent storage is per availability zone, and if an availability zone in AWS experiences major issues, it impacts all of those things.

      Now an instance runs on a specific host, and if you restart the instance, it will stay on that host. Instances stay on a host until one of two things happen: firstly, the host fails or is taken down for maintenance for some reason by AWS; or secondly, an instance is stopped and then started, and that's different than just restarting, so I'm focusing on an instance being stopped and then being started, not just a restart. If either of those things happen, then the instance will be relocated to another host, but that host will also be in the same availability zone.
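      That relocation rule can be summarized as a small illustrative function. The naming here is my own, not an AWS API; it just encodes the lesson's rule that a restart keeps the instance on its host, while a stop/start or a host failure/maintenance event can move it, and only ever to another host in the same availability zone:

```python
def host_after(action):
    """Summarizes the lesson's placement rule (illustrative, not an AWS API).

    Returns (may_change_host, may_change_az) for a given lifecycle action.
    """
    may_change_host = action in ("stop_start", "host_failure", "host_maintenance")
    may_change_az = False  # instances never natively move between AZs
    return may_change_host, may_change_az

print(host_after("restart"))     # (False, False) - a restart stays on the same host
print(host_after("stop_start"))  # (True, False)  - may relocate, but same AZ only
```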

      Instances cannot natively move between availability zones. Everything about them, their hardware, networking and storage is locked inside one specific availability zone. Now there are ways you can do a migration, but it essentially means taking a copy of an instance and creating a brand new one in a different availability zone, and I'll be covering that later in this section where I talk about snapshots and AMIs.

      What you can never do is connect network interfaces or EBS storage located in one availability zone to an EC2 instance located in another. EC2 and EBS are both availability zone services, they're isolated, you cannot cross AZs with instances or with EBS volumes. Now instances running on an EC2 host share the resources of that host. And instances of different sizes can share a host, but generally instances of the same type and generation will occupy the same host.

      And I'll be talking in much more detail about instance types, sizes and generations in a lesson that's coming up very soon. But when you think about an EC2 host, think of it as being from a certain year and including a certain class of processor and a certain type of memory and a certain type and configuration of storage. Instances are also created as different generations, different versions that supply specific types of CPU, memory and storage, so it's logical that if you provision two different types of instances, they may well end up on two different types of hosts.

      So a host generally has lots of different instances from different customers of the same type, but different sizes. So before we finish up this lesson, I want to answer a question. That question is what's EC2 good for? So what types of situations might you use EC2 for? And this is equally valuable when you're evaluating a technical architecture while you're answering questions in the exam.

      So first, EC2 is great when you've got a traditional OS and application compute need. If you've got an application that needs to run on a certain operating system with a certain runtime and certain configuration, maybe your internal technical staff are used to that configuration, or maybe your vendor has a certain set of support requirements, EC2 is a perfect use case for this type of scenario.

      And it's also great for any long running compute needs. There are lots of other services inside AWS that provide compute services, but many of these have got runtime limits, so you can't leave these things running consistently for one year or two years. With EC2, it's designed for persistent, long running compute requirements. So if you have an application that runs constantly 24/7, 365, and needs to be running on a normal operating system, Linux or Windows, then EC2 is the default and obvious choice for this.

      If you have any server-style applications, so traditional applications that expect to be running in an operating system, waiting for incoming connections, then again, EC2 is a perfect service for this. And it's perfect for any applications or services that have burst requirements or steady-state requirements. There are different types of EC2 instances which are suitable for low levels of normal load with occasional bursts, as well as for steady-state load.

      So again, if your application needs an operating system, and it's got bursty needs or a consistent steady-state load, then EC2 should be the first thing that you review. EC2 is also great for a monolithic application stack: if your monolithic application requires certain components, a stack, maybe a database, maybe some middleware, maybe other runtime-based components, and especially if it needs to be running on a traditional operating system, EC2 should be the first thing that you look at.

      And EC2 is also ideally suited for migrating application workloads, so application workloads, which expect a traditional virtual machine or server style environment, or if you're performing disaster recovery. So if you have existing traditional systems which run on virtual servers, and you want to provision a disaster recovery environment, then EC2 is perfect for that.

      In general, EC2 tends to be the default compute service within AWS. There are lots of niche requirements that you might have, and if you do have those, there are other compute services such as the elastic container service or Lambda. But generally, if you've got traditional style workloads, or you're looking for something that's consistent, or if it requires an operating system, or if it's monolithic, or if you migrated into AWS, then EC2 is a great default first option.

      Now in this section of the course, I'm covering the basic architectural components of EC2, so I'm gonna be introducing the basics and let you get some exposure to it, and I'm gonna be teaching you all the things that you'll need for the exam.

    1. Welcome back and in this first lesson of the EC2 section of the course, I want to cover the basics of virtualization as briefly as possible. EC2 provides virtualization as a service. It's an Infrastructure as a Service, or IaaS, product. To understand all the value it provides and why some of the features work the way that they do, understanding the fundamentals of virtualization is essential. So that's what this lesson aims to do.

      Now, I want to be super clear about one thing. This is an introduction level lesson. There's a lot more to virtualization than I can talk about in this brief lesson. This lesson is just enough to get you started, but I will include a lot of links in the lesson description if you want to learn more. So let's get started.

      We do have a fair amount of theory to get through, but I promise when it comes to understanding how EC2 actually works, this lesson will be really beneficial. Virtualization is the process of running more than one operating system on a piece of physical hardware, a server. Before virtualization, the architecture looked something like this. A server had a collection of physical resources, so CPU and memory, network cards and maybe other logical devices such as storage. And on top of this runs a special piece of software known as an operating system.

      That operating system runs with a special level of access to the hardware. It runs in privileged mode, or more specifically, a small part of the operating system runs in privileged mode, known as the kernel. The kernel is the only part of the operating system, the only piece of software on the server, that's able to directly interact with the hardware. Some of the operating system doesn't need this privileged level of access, but some of it does. Now, the operating system can allow other software to run, such as applications, but these run in user mode, or unprivileged mode. They cannot directly interact with the hardware; they have to go through the operating system.

      So if Bob or Julie are attempting to do something with an application, which needs to use the system hardware, that application needs to go through the operating system. It needs to make a system call. If anything but the operating system attempts to make a privileged call, so tries to interact with the hardware directly, the system will detect it and cause a system-wide error, generally crashing the whole system or at minimum the application. This is how it works without virtualization.
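      The system call boundary described above can be sketched in a few lines. This is an illustrative Python sketch (not part of the lesson): `os.write` and `os.read` are thin wrappers over the `write(2)` and `read(2)` system calls, so even this simple file I/O only ever reaches the disk via the kernel, which runs in privileged mode on the application's behalf.

```python
import os
import tempfile

# Unprivileged applications never touch the disk hardware directly; they ask
# the kernel via system calls. os.write() wraps the write(2) syscall, and
# os.read() wraps read(2) -- the kernel performs the actual hardware access.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello via a system call")  # user mode -> syscall -> kernel
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 1024)                  # another syscall through the kernel
finally:
    os.close(fd)
    os.unlink(path)

print(data)  # b'hello via a system call'
```

      If an unprivileged process tried to bypass this boundary and execute a privileged instruction directly, the CPU would fault and the kernel would typically kill the process, which is exactly the behaviour the lesson describes.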

      Virtualization is how this is changed into this: a single piece of hardware running multiple operating systems. Each operating system is separate, each runs its own applications. But there's a problem. CPUs, at least at this point in time, could only have one thing running as privileged, and a privileged process has direct access to the hardware. All of these operating systems, if they're running in their unmodified state, expect to be running on their own in a privileged state. They contain privileged instructions. And so trying to run three or four or more different operating systems in this way will cause system crashes.

      Virtualization was created as a solution to this problem, allowing multiple different privileged applications to run on the same hardware. But initially, virtualization was really inefficient, because the hardware wasn't aware of it. Virtualization had to be done in software, and it was done in one of two ways. The first type was known as emulated virtualization, or software virtualization. With this method, a host operating system still ran on the hardware and included an additional capability known as a hypervisor. This software ran in privileged mode, and so it had full access to the hardware of the host server.

      Now, the multiple other operating systems, which we'll now refer to as guest operating systems, were each wrapped in a container of sorts called a virtual machine. Each virtual machine was an unmodified operating system, such as Windows or Linux, with a virtual allocation of resources such as CPU, memory and local disk space. Virtual machines also had devices mapped into them, such as network cards, graphics cards and other local devices such as storage. The guest operating systems believed these to be real. They had drivers installed, just like physical devices, but they weren't real hardware. They were all emulated, fake information provided by the hypervisor to make the guest operating systems believe that they were real.

      The crucial thing to understand about emulated virtualization is that the guest operating systems still believed that they were running on real hardware, and so they still attempted to make privileged calls. They tried to take control of the CPU, and they tried to directly read and write to what they thought of as their memory and their disk, which were actually not real, just areas of physical memory and disk that had been allocated to them by the hypervisor. Without special arrangements, the system would at best crash, and at worst, all of the guests would be overwriting each other's memory and disk areas.

      So the hypervisor performs a process known as binary translation. Any privileged operations which the guests attempt to make are intercepted and translated on the fly, in software, by the hypervisor. Now, doing the binary translation in software is the key part of this. It means that the guest operating systems need no modification, but it's really, really slow. It can actually halve the speed of the guest operating systems, or even worse. Emulated virtualization was a cool set of features for its time, but it never achieved widespread adoption for demanding workloads because of this performance penalty.
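      To make the trap-and-translate idea concrete, here's a purely illustrative Python toy (all opcode names are hypothetical, invented for this sketch): unprivileged instructions run directly, while anything privileged is intercepted and emulated in software by a stand-in "hypervisor" function. This is loosely what binary translation achieves, minus the actual on-the-fly instruction rewriting.

```python
# Toy sketch of trap-and-emulate: privileged "instructions" never touch real
# hardware; the hypervisor intercepts them and updates shadow state instead.
PRIVILEGED = {"OUT_PORT", "WRITE_CR3"}  # hypothetical privileged opcodes

def hypervisor_translate(instr, guest_state):
    """Intercept a privileged instruction and emulate it in software."""
    op, arg = instr
    if op == "OUT_PORT":
        # Emulated device I/O: record the port write instead of doing it.
        guest_state.setdefault("emulated_io", []).append(arg)
    elif op == "WRITE_CR3":
        # Shadow page-table base, not the real CR3 register.
        guest_state["page_table_base"] = arg
    return guest_state

def run_guest(program, guest_state):
    for instr in program:
        op, arg = instr
        if op in PRIVILEGED:
            hypervisor_translate(instr, guest_state)  # trap to the hypervisor
        else:
            guest_state[op] = arg  # unprivileged ops run "directly"
    return guest_state

state = run_guest(
    [("MOV_AX", 42), ("OUT_PORT", 0x3F8), ("WRITE_CR3", 0x1000)],
    {},
)
print(state)
```

      The performance cost the lesson mentions comes from exactly this kind of per-instruction interception, which real hypervisors had to do in software before hardware assistance existed.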

      But there was another way that virtualization was initially handled, and this is called para-virtualization. With para-virtualization, the guest operating systems are still running in the same virtual machine containers with virtual resources allocated to them, but instead of the slow binary translation which is done by the hypervisor, another approach is used. Para-virtualization only works on a small subset of operating systems, operating systems which can be modified. Because with para-virtualization, there are areas of the guest operating systems which attempt to make privileged calls, and these are modified. They're modified to make them user calls, but instead of directly calling on the hardware, they're calls to the hypervisor called hypercalls.

      So areas of the operating systems which would traditionally make privileged calls directly to the hardware are actually modified. The source code of the operating system is modified to call the hypervisor rather than the hardware. So the operating systems now need to be modified specifically for the particular hypervisor that's in use. It's no longer just generic virtualization; the operating systems are modified for the particular vendor performing this para-virtualization. By modifying the operating system this way, and using para-virtual drivers in the operating system for network cards and storage, the operating system became almost virtualization aware, and this massively improved performance. But it was still a set of software processes designed to trick the operating system and/or the hardware into believing that nothing had changed.

      The major improvement in virtualization came when the physical hardware started to become virtualization aware. This allows for hardware virtualization, also known as hardware assisted virtualization. With hardware assisted virtualization, hardware itself has become virtualization aware. The CPU contains specific instructions and capabilities so that the hypervisor can directly control and configure this support, so the CPU itself is aware that it's performing virtualization. Essentially, the CPU knows that virtualization exists.

      What this means is that when guest operating systems attempt to run any privileged instructions, they're trapped by the CPU, which knows to expect them from these guest operating systems, so the system as a whole doesn't halt. But these instructions can't be executed as is because the guest operating system still thinks that it's running directly on the hardware, and so they're redirected to the hypervisor by the hardware. The hypervisor handles how these are executed. And this means very little performance degradation over running the operating system directly on the hardware.
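      On Linux you can see whether a CPU advertises these hardware virtualization extensions by looking for the `vmx` (Intel VT-x) or `svm` (AMD-V) flags in `/proc/cpuinfo`. A minimal sketch, assuming a Linux host; the function takes the file's text as input so it can also be tried on a sample string:

```python
def hardware_virt_support(cpuinfo_text):
    """Return which hardware virtualization extension the CPU advertises,
    based on the 'flags' lines of /proc/cpuinfo (Linux-specific):
    'vmx' = Intel VT-x, 'svm' = AMD-V, None = no support advertised."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags:
                return "vmx"  # Intel VT-x
            if "svm" in flags:
                return "svm"  # AMD-V
    return None

# On a real Linux host you would feed it the actual file:
#   with open("/proc/cpuinfo") as f:
#       print(hardware_virt_support(f.read()))
sample = "processor : 0\nflags : fpu vme de pse vmx ept\n"
print(hardware_virt_support(sample))  # -> vmx
```

      Note that inside a cloud VM these flags are often hidden from the guest unless nested virtualization is enabled, so a missing flag doesn't necessarily mean the physical CPU lacks the feature.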

      The problem, though, is that while this method does help a lot, what actually matters about a virtual machine tends to be the input/output operations, so network transfer and disk I/O. The virtual machines have what they think is physical hardware, for example, a network card. But these cards are just logical devices using a driver, which actually connect back to a single physical piece of hardware which sits in the host, the hardware that everything is running on.

      Unless you have a physical network card per virtual machine, there's always going to be some level of software getting in the way, and when you're performing highly transactional activities such as network I/O or disk I/O, this really impacts performance, and it consumes a lot of CPU cycles on the host.

      The final iteration that I want to talk about is where the hardware devices themselves become virtualization aware, such as network cards. This process is called SR-IOV, single root I/O virtualization. Now, I could talk for hours about exactly what this process does and how it works, because it's a very complex and feature-rich set of standards. But at a very high level, it allows a network card, or any other add-on card, to present itself not as one single card, but as several mini-cards.

      Because this is supported in hardware, these are fully unique cards as far as the hardware is concerned, and they are directly presented to the guest operating systems as real cards dedicated for their use. And this means no translation has to happen in the hypervisor. The guest operating system can directly use its card whenever it wants. Now, the physical card which supports SR-IOV handles this process end-to-end. It makes sure that when the guest operating systems use their logical mini network cards, they have access to the physical network connection when required.
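      On Linux, an SR-IOV capable device exposes how many of these mini-cards (virtual functions) it can present via the sysfs attribute `sriov_totalvfs`. A small sketch, assuming a Linux host; the `sys_root` parameter exists only to make the helper testable against a fake sysfs tree:

```python
import os

def sriov_total_vfs(iface, sys_root="/sys"):
    """Return how many SR-IOV virtual functions a network device can expose,
    by reading the Linux sysfs attribute
    <sys_root>/class/net/<iface>/device/sriov_totalvfs.
    Returns None if the device is not SR-IOV capable (attribute absent)."""
    path = os.path.join(sys_root, "class", "net", iface,
                        "device", "sriov_totalvfs")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (FileNotFoundError, ValueError):
        return None

# On an EC2 instance with enhanced networking you might call:
#   sriov_total_vfs("eth0")
```

      The interface name `eth0` in the usage comment is just an example; actual device names vary by distribution and instance type.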

      In EC2, this feature is called enhanced networking, and it means that the network performance is massively improved. It means faster speeds. It means lower latency. And more importantly, it means consistent lower latency, even at high loads. It means less CPU usage for the host CPU, even when all of the guest operating systems are consuming high amounts of consistent I/O.

      Many of the features that you'll see EC2 using are actually based on AWS implementing some of the more advanced virtualization techniques that have been developed across the industry. AWS do have their own hypervisor stack now called Nitro, and I'll be talking about that in much more detail in an upcoming lesson, because that's what enables a lot of the higher-end EC2 features.

      But that's all the theory I wanted to cover. I just wanted to introduce virtualization at a high level and get you to the point where you understand what SR-IOV is, because SR-IOV is used for enhanced networking right now, but it's also a feature that can be used outside of just network cards. It can help hardware manufacturers design cards which, whilst they're a single physical card, can be split up into logical cards that can be presented to guest operating systems. It essentially makes any hardware virtualization aware, and any of the advanced EC2 features that you'll come across within this course will be taking advantage of SR-IOV.

      At this point, though, we've completed all of the theory I wanted to cover, so go ahead and complete this lesson when you're ready, and you can join me in the next.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      Examination of (a)periodic brain activity has gained particular interest in the last few years in the neuroscience fields relating to cognition, disorders, and brain states. Using large EEG/MEG datasets from younger and older adults, the current study provides compelling evidence that age-related differences in aperiodic EEG/MEG signals can be driven by cardiac rather than brain activity. Their findings have important implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac signals is essential.

      We want to thank the editors for their assessment of our work and highlighting its importance for the understanding of aperiodic neural activity. Additionally, we want to thank the three present and four former reviewers (at a different journal) whose comments and ideas were critical in shaping this manuscript to its current form. We hope that this paper opens up many more questions that will guide us - as a field - to an improved understanding of how “cortical” and “cardiac” changes in aperiodic activity are linked and want to invite readers to engage with our work through eLife’s comment function.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.

      Strengths:

      Compared to previous investigations, the authors demonstrate the effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. Based on their findings, the authors give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the aim of the study is well-motivated and analyses rigorously conducted, the overall structure of the manuscript, as it stands now, is partially misleading. Some of the described results are not well-embedded and lack discussion.

      We want to thank the reviewer for their comments focussed on improving the overall structure of the manuscript. We agree with their suggestions that some results could be more clearly contextualized and restructured the manuscript accordingly.

      Reviewer #2 (Public review):

      I previously reviewed this important and timely manuscript at a previous journal where, after two rounds of review, I recommended publication. Because eLife practices an open reviewing format, I will recapitulate some of my previous comments here, for the scientific record.

      In that previous review, I revealed my identity to help reassure the authors that I was doing my best to remain unbiased because I work in this area and some of the authors' results directly impact my prior research. I was genuinely excited to see the earlier preprint version of this paper when it first appeared. I get a lot of joy out of trying to - collectively, as a field - really understand the nature of our data, and I continue to commend the authors here for pushing at the sources of aperiodic activity!

      In their manuscript, Schmidt and colleagues provide a very compelling, convincing, thorough, and measured set of analyses. Previously I recommended that they push even further, and they added the current Figure 5 analysis of event-related changes in the ECG during working memory. In my opinion this result practically warrants a separate paper of its own!

      The literature analysis is very clever, and expanded upon from any other prior version I've seen.

      In my previous review, the broadest, most high-level comment I wanted to make was that authors are correct. We (in my lab) have tried to be measured in our approach to talking about aperiodic analyses - including adopting measuring ECG when possible now - because there are so many sources of aperiodic activity: neural, ECG, respiration, skin conductance, muscle activity, electrode impedances, room noise, electronics noise, etc. The authors discuss this all very clearly, and I commend them on that. We, as a field, should move more toward a model where we can account for all of those sources of noise together. (This was less of an action item, and more of an inclusion of a comment for the record.)

      I also very much appreciate the authors' excellent commentary regarding the physiological effects that pharmacological challenges such as propofol and ketamine also have on non-neural (autonomic) functions such as ECG. Previously I also asked them to discuss the possibility that, while their manuscript focuses on aperiodic activity, it is possible that the wealth of literature regarding age-related changes in "oscillatory" activity might be driven partly by age-related changes in neural (or non-neural, ECG-related) changes in aperiodic activity. They have included a nice discussion on this, and I'm excited about the possibilities for cognitive neuroscience as we move more in this direction.

      Finally, I previously asked for recommendations on how to proceed. The authors convinced me that we should care about how the ECG might impact our field potential measures, but how do I, as a relative novice, proceed? They now include three strong recommendations at the end of their manuscript that I find to be very helpful.

      As was obvious from previous review, I consider this to be an important and impactful cautionary report, that is incredibly well supported by multiple thorough analyses. The authors have done an excellent job responding to all my previous comments and concerns and, in my estimation, those of the previous reviewers as well.

      We want to thank the reviewer for agreeing to review our manuscript again and for recapitulating on their previous comments and the progress the manuscript has made over the course of the last ~2 years. The reviewer's comments have been essential in shaping the manuscript into its current form. Their feedback has made the review process truly feel like a collaborative effort, focused on strengthening the manuscript and refining its conclusions and resulting recommendations.

      Reviewer #3 (Public review):

      Summary:

      Schmidt et al., aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al., used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      We want to thank the reviewer for their thorough engagement with our work and the substantial number of great ideas mentioned both in the Weaknesses section and in the Authors Recommendations below. Their suggestions have sparked many ideas in us on how to move forward in better separating peripheral- from neuro-physiological signals, which are likely to greatly influence our future attempts to better extract both cardiac and muscle activity from M/EEG recordings. So we want to thank them for their input, time and effort!

      Weaknesses:

      Figure 4I: The regressions explained here seem to contain a very large number of potential predictors. Based on the way it is currently written, I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions?

      I'm not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including these latent contributions to the full signal back into the same regression model. It seems that there could be some circularity or redundancy in doing so. Can the authors provide a justification for why this is a valid approach?

      After observing significant effects both in the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands we wanted to understand whether or not these age-related changes are statistically independent. To test this we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over channels on which we noticed significant effects in the ECG rejected and ECG components data respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”  

      We hope this explanation solves the previous misunderstanding.

      I'm not sure whether there is good evidence or rationale to support the statement in the discussion that the presence of the ECG signal in reference electrodes makes it more difficult to isolate independent ECG components. The ICA algorithm will still function to detect common voltage shifts from the ECG as statistically independent from other voltage shifts, even if they're spread across all electrodes due to the referencing montage. I would suggest there are other reasons why the ICA might lead to imperfect separation of the ECG component (assumption of the same number of source components as sensors, non-Gaussian assumption, assumption of independence of source activities).

      The inclusion of only 32 channels in the EEG data might also have reduced the performance of ICA, increasing the chances of imperfect component separation and the mixing of cardiac artifacts into the neural components, whereas the higher number of sensors in the MEG data would enable better component separation. This could explain the difference between EEG and MEG in the ability to clean the ECG artifact (and perhaps higher-density EEG recordings would not show the same issue).

      The reviewer is making a good argument suggesting that our initial assumption, that the presence of cardiac activity on the reference electrode influences the performance of the ICA, may be wrong. After rereading and rethinking the matter, we agree that the reviewer's assumptions for why the ECG signal was not so easily separable from our EEG recordings are more plausible and better grounded in the literature than our initial suggestion. We therefore now highlight their view as a main reason for why the ECG rejection was more challenging in EEG data. However, we also note that understanding the exact reason probably ends up being an empirical question that demands further research, stating that:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources. ”

      In addition to the inability to effectively clean the ECG artifact from EEG data, ICA and other component subtraction methods have also all been shown to distort neural activity in periods that aren't affected by the artifact due to the ubiquitous issue of imperfect component separation (https://doi.org/10.1101/2024.06.06.597688). As such, component subtraction-based (as well as regression-based) removal of the cardiac artifact might also distort the neural contributions to the aperiodic signal, so even methods to adequately address the cardiac artifact might not solve the problem explained in the study. This poses an additional potential confound to the "M/EEG without ECG" conditions.

      The reviewer is correct in stating that, if an “artifactual” signal is not always present but appears and disappears (like e.g. eye-blinks) neural activity may be distorted in periods where the “artifactual” signal is absent. However, while this plausibly presents a problem for ocular activity, there is no obvious reason to believe that this applies to cardiac activity. While the ECG signal is non-stationary in nature, it is remarkably more stable than eye-movements in the healthy populations we analyzed (especially at rest). Therefore, the presence of the cardiac “artifact” was consistently present across the entirety of the MEG recordings we visually inspected.

      Literature Analysis, Page 23: was there a method applied to address studies that report reducing artifacts in general, but are not specific to a single type of artifact? For example, there are automated methods for cleaning EEG data that use ICLabel (a machine learning algorithm) to delete "artifact" components. Within these studies, the cardiac artifact will not be mentioned specifically, but is included under "artifacts".

      The literature analysis was largely performed automatically and solely focussed on ECG related activity as described in the methods section under Literature Analysis, if no ECG related terms were used in the context of artifact rejection a study was flagged as not having removed cardiac activity. This could have been indeed better highlighted by us and we apologize for the oversight on our behalf. We now additionally link to these details stating that:

      “However, an analysis of openly accessible M/EEG articles (N<sub>Articles</sub>=279; see Methods - Literature Analysis for further details) that investigate aperiodic activity revealed that only 17.1% of EEG studies explicitly mention that cardiac activity was removed and only 16.5% measure ECG (45.9% of MEG studies removed cardiac activity and 31.1% of MEG studies mention that ECG was measured; see Figure 1EF).”

      The reviewer makes a fair point that there is some uncertainty here, and our results probably present a lower bound of ECG handling in M/EEG research: when we manually rechecked the studies that were not initially flagged, it was often only mentioned that “artifacts” were rejected. However, this information seemed too ambiguous to assume that cardiac activity was in fact accounted for. Again, this could have been mentioned more clearly in writing, and we apologize for this oversight. It is now included as part of the methods section Literature Analysis, stating that:

      “All valid word contexts were then manually inspected by scanning the respective word context to ensure that the removal of “artifacts” was related specifically to cardiac and not e.g. ocular activity or the rejection of artifacts in general (without specifying which “artifactual” source was rejected in which case the manuscript was marked as invalid). This means that the results of our literature analysis likely present a lower bound for the rejection of cardiac activity in the M/EEG literature investigating aperiodic activity.”

      Statistical inferences, page 23: as far as I can tell, no methods to control for multiple comparisons were implemented. Many of the statistical comparisons were not independent (or even overlapped with similar analyses in the full analysis space to a large extent), so I wouldn't expect strong multiple comparison controls. But addressing this point to some extent would be useful (or clarifying how it has already been addressed if I've missed something).

      In the present study we tried to minimize the risk of type 1 errors by several means, such as A) weakly informative priors, B) robust regression models and C) by specifying a region of practical equivalence (ROPE, see Methods Statistical Inference for further Information) to define meaningful effects.

      Weakly informative priors can lower the risk of type 1 errors arising from multiple testing by shrinking parameter estimates towards zero (see e.g. Lemoine, 2019). Robust regression models use a Student T distribution to describe the distribution of the data. This distribution features heavier tails, meaning it allocates more probability to extreme values, which in turn minimizes the influence of outliers. The ROPE criterion ensures that only effects exceeding a negligible size are considered meaningful, representing a strict and conservative approach to interpreting our findings (see Kruschke 2018, Cohen, 1988).

      Furthermore, and more generally we do not selectively report “significant” effects in the situations in which multiple analyses were conducted on the same family of data (e.g. Figure 2 & 4). Instead we provide joint inference across several plausible analysis options (akin to a specification curve analysis, Simonsohn, Simmons & Nelson 2020) to provide other researchers with an overview of how different analysis choices impact the association between cardiac and neural aperiodic activity.

      Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.

      Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4(11), 1208-1214.

      Methods:

      Applying ICA components from 1Hz high pass filtered data back to the 0.1Hz filtered data leads to worse artifact cleaning performance, as the contribution of the artifact in the 0.1Hz to 1Hz frequency band is not addressed (see Bailey, N. W., Hill, A. T., Biabani, M., Murphy, O. W., Rogasch, N. C., McQueen, B., ... & Fitzgerald, P. B. (2023). RELAX part 2: A fully automated EEG data cleaning algorithm that is applicable to Event-Related-Potentials. Clinical Neurophysiology, result reported in the supplementary materials). This might explain some of the lower frequency slope results (which include a lower frequency limit <1Hz) in the EEG data - the EEG cleaning method is just not addressing the cardiac artifact in that frequency range (although it certainly wouldn't explain all of the results).

      We want to thank the reviewer for suggesting this interesting paper, which shows that lower high-pass filters may be preferable to the more commonly used >1Hz high-pass filters for detecting ICA components that largely contain peripheral physiological activity. However, the results presented by Bailey et al. contradict the more commonly reported finding that a >1Hz high-pass filter is actually preferable (e.g. Winkler et al., 2015; Dimigen, 2020; or Klug & Gramann, 2021), as well as the recommendations in widely used packages for M/EEG analysis (e.g. https://mne.tools/1.8/generated/mne.preprocessing.ICA.html). The fact that there seems to be a discrepancy suggests that further research is needed to better understand which type of high-pass filtering is preferable in which situation. Furthermore, it is notable that all the findings on high-pass filtering in ICA component detection and removal that we are aware of relate to ocular activity. Given that ocular and cardiac activity have very different temporal and spectral patterns, it is probably worth investigating further whether the classic 1Hz high-pass filter is really also the best option for the detection and removal of cardiac activity. In our opinion, however, this requires a dedicated investigation of its own.

      We therefore highlight this now in our manuscript stating that:

      “Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings, e.g. high-pass filtering at 1Hz before fitting an ICA, may not be universally applicable (see supplementary material of (64)).”

      Winkler, I., Debener, S., Müller, K.-R., & Tangermann, M. (2015). On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP. 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, pp. 4101-4105. doi: 10.1109/EMBC.2015.7319296

      Dimigen, O. (2020). Optimizing the ICA-based removal of ocular EEG artifacts from free viewing experiments. NeuroImage, 207, 116117.

      Klug, M., & Gramann, K. (2021). Identifying key factors for improving ICA‐based decomposition of EEG data in mobile and stationary experiments. European Journal of Neuroscience, 54(12), 8406-8420.

      It looks like no methods were implemented to address muscle artifacts. These can affect the slope of EEG activity at higher frequencies. Perhaps the Riemannian Potato addressed these artifacts, but I suspect it wouldn't eliminate all muscle activity. As such, I would be concerned that remaining muscle artifacts affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data (although I suspect it wouldn't reverse the overall conclusions given the number of converging results including in lower frequency bands). Is there a quick validity analysis the authors can implement to confirm muscle artifacts haven't negatively affected their results?

      I note that an analysis of head movement in the MEG is provided on page 32, but it would be more robust to show that removing ICA components reflecting muscle doesn't change the results. The results/conclusions of the following study might be useful for objectively detecting probable muscle artifact components: Fitzgibbon, S. P., DeLosAngeles, D., Lewis, T. W., Powers, D. M. W., Grummett, T. S., Whitham, E. M., ... & Pope, K. J. (2016). Automatic determination of EMG-contaminated components and validation of independent component analysis using EEG during pharmacologic paralysis. Clinical neurophysiology, 127(3), 1781-1793.

      We thank the reviewer for their suggestion. Muscle activity can indeed be a potential concern for the estimation of the spectral slope. This is precisely why we used head movements (as also noted by the reviewer) as a proxy for muscle activity. We agree with the reviewer that this is not a perfect estimate. Additionally, the Riemannian Potato would probably only capture epochs that contain transient, but not persistent, patterns of muscle activity.

      The paper recommended by the reviewer contains a clever approach: using the steepness of the spectral slope (or lack thereof) as an indicator of whether or not an independent component (IC) is driven by muscle activity. To determine an optimal threshold, Fitzgibbon et al. compared paralyzed to temporarily non-paralyzed subjects. They derived an expected “EMG-free” threshold for the spectral slope from the paralyzed subjects and used this as a benchmark to detect ICs that were contaminated by muscle activity in non-paralyzed subjects.

      This is a great idea, but unfortunately it goes well beyond what we can sensibly estimate with our data, for the following reasons. The authors estimated their optimal threshold on paralyzed subjects for EEG data and showed that this threshold is applicable across different recordings. So for EEG data it might be feasible, at least as a first pass, to apply their threshold to our data. However, we are measuring MEG, and as discussed in our discussion section under “Differences in aperiodic activity between magnetic and electric field recordings”, the spectral slope differs greatly between MEG and EEG recordings for non-trivial reasons. Furthermore, the spectral slope even seems to differ across different MEG devices; we noticed this when we initially tried to pool the data recorded in Salzburg with the Cambridge dataset. This means we would need a complete validation of this procedure for the MEG data recorded in Cambridge and in Salzburg, which is not feasible considering that we A) don't have direct access to one of the recording sites and B) even if we had access, would face substantial hurdles in obtaining ethical approval for the experiment performed by Fitzgibbon et al.

      However, we think the approach put forward by Fitzgibbon and colleagues is a clever way to remove muscle activity from EEG recordings whenever EMG was not directly recorded. We therefore suggest in the Discussion section that, ideally, EMG should also be recorded, stating that:

      “It is worth noting that, apart from cardiac activity, muscle activity can also be captured in (non-)invasive recordings and may drastically influence measures of the spectral slope(72). To ensure that persistent muscle activity does not bias our results we used changes in head movement velocity as a control analysis (see Supplementary Figure S9). However, it should be noted that this is only a proxy for the presence of persistent muscle activity. Ideally, studies investigating aperiodic activity should also be complemented by measurements of EMG. Whenever such measurements are not available creative approaches that use the steepness of the spectral slope (or the lack thereof) as an indicator to detect whether or not e.g. an independent component is driven by muscle activity are promising(72,73). However, these approaches may require further validation to determine how well myographic aperiodic thresholds are transferable across the wide variety of different M/EEG devices.”
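As an aside for readers, the slope-based muscle-detection idea referenced above can be sketched in a few lines (synthetic components only; the cutoff value, frequency band, and component signals below are illustrative assumptions, whereas Fitzgibbon et al. derived their actual threshold empirically from paralyzed subjects): a component whose log-log power spectrum is nearly flat, i.e. whose slope is shallower than some threshold, is flagged as likely EMG-driven.

```python
import numpy as np
from scipy import signal

def spectral_slope(x, fs, fmin=7.0, fmax=75.0):
    """Fit a 1/f slope to a component's power spectrum in log-log space."""
    freqs, psd = signal.welch(x, fs=fs, nperseg=int(2 * fs))
    mask = (freqs >= fmin) & (freqs <= fmax)
    slope, _ = np.polyfit(np.log10(freqs[mask]), np.log10(psd[mask]), 1)
    return slope

rng = np.random.default_rng(1)
fs = 500
# "Neural-like" component: steep 1/f spectrum (cumulative sum of noise, slope ~ -2)
neural = np.cumsum(rng.normal(size=fs * 60))
# "Muscle-like" component: broadband, nearly flat spectrum
muscle = rng.normal(size=fs * 60)

threshold = -0.6  # illustrative cutoff only, not the empirically derived value
for name, comp in [("neural", neural), ("muscle", muscle)]:
    s = spectral_slope(comp, fs)
    flagged = s > threshold  # shallower than threshold -> likely EMG-driven
    print(f"{name}: slope={s:.2f}, flagged as muscle: {flagged}")
```

The validation hurdle discussed above amounts to the fact that the right `threshold` would need to be re-established for each recording modality and device.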

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) As outlined above, I recommend rephrasing the last section of the introduction to briefly summarize/introduce all main analysis steps undertaken in the study and why these were done (for example, it is only mentioned that the Cam-CAN dataset was used to study the impact of cardiac on MEG activity although the author used a variety of different datasets). Similarly, I am missing an overview of all main findings in the context of the study goals in the discussion. I believe clarifying the structure of the paper would not only provide a red thread to the reader but also highlight the efforts/strength of the study as described above.

      This is a good call! As suggested by the reviewer, we now try to give a clearer overview of what was investigated and why. We do so both at the end of the introduction, stating that: “Using the publicly available Cam-CAN dataset(28,29), we find that the aperiodic signal measured using M/EEG originates from multiple physiological sources. In particular, significant portions of age-related changes in aperiodic activity –normally attributed to neural processes– can be better explained by cardiac activity. This observation holds across a wide range of processing options and control analyses (see Supplementary S1), and was replicable on a separate MEG dataset. However, the extent to which cardiac activity accounts for age-related changes in aperiodic activity varies with the investigated frequency range and recording site. Importantly, in some frequency ranges and sensor locations, age-related changes in neural aperiodic activity still prevail. But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging. In sum, our results highlight the complexity of aperiodic activity while cautioning against interpreting it as solely “neural“ without considering physiological influences.”

      and at the beginning of the discussion section:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources (see Figure 1EF). Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)). “

      (2) I found it interesting that the spectral slopes of ECG activity at higher frequency ranges (> 10 Hz) seem mostly related to HRV measures such as fractal and time domain indices and less so with frequency-domain indices. Do the authors have an explanation for why this is the case? Also, the analysis of the HRV measures and their association with aperiodic ECG activity is not explained in any of the method sections.

      We apologize for the oversight in not mentioning the HRV analysis in more detail in our methods section. We added a subsection to the Methods section entitled ECG Processing - Heart rate variability analysis to further describe the HRV analyses.

      “ECG Processing - Heart rate variability analysis

      Heart rate variability (HRV) was computed using the NeuroKit2 toolbox, a high-level tool for the analysis of physiological signals. First, the raw electrocardiogram (ECG) data were preprocessed by high-pass filtering the signal at 0.5Hz using an infinite impulse response (IIR) Butterworth filter (order = 5) and by smoothing the signal with a moving-average kernel with the width of one period of 50Hz to remove powerline noise (default settings of neurokit.ecg.ecg_clean). Afterwards, QRS complexes were detected based on the steepness of the absolute gradient of the ECG signal. Subsequently, R-peaks were detected as local maxima in the QRS complexes (default settings of neurokit.ecg.ecg_peaks; see (98) for a validation of the algorithm). From the cleaned R-R intervals, 90 HRV indices were derived, encompassing time-domain, frequency-domain, and non-linear measures. Time-domain indices included standard metrics such as the mean and standard deviation of the normalized R-R intervals, the root mean square of successive differences, and other statistical descriptors of interbeat interval variability. Frequency-domain analyses were performed using power spectral density estimation, yielding, for instance, low-frequency (0.04-0.15Hz) and high-frequency (0.15-0.4Hz) power components. Additionally, non-linear dynamics were characterized through measures such as sample entropy, detrended fluctuation analysis and various Poincaré plot descriptors. All these measures were then related to the slopes of the low-frequency (0.25 – 20 Hz) and high-frequency (10 – 145 Hz) aperiodic spectrum of the raw ECG.”
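NeuroKit2 wraps all of these steps internally; to illustrate the time-domain part of the pipeline for readers, a dependency-free sketch might look as follows (synthetic spike-train data and simple thresholded peak picking stand in for real ECG and for NeuroKit2's gradient-based QRS detector; the numbers are arbitrary):

```python
import numpy as np
from scipy import signal

fs = 250
rng = np.random.default_rng(2)

# Synthetic "ECG": unit-amplitude spikes at R-peak times with jittered ~1 s periods
rr_true = 1.0 + 0.05 * rng.normal(size=60)          # R-R intervals in seconds
peak_times = np.cumsum(rr_true)
ecg = 0.05 * rng.normal(size=int(peak_times[-1] * fs) + fs)
ecg[(peak_times * fs).astype(int)] = 1.0

# R-peak detection (real pipelines use a QRS detector, not a fixed threshold)
peaks, _ = signal.find_peaks(ecg, height=0.5, distance=int(0.4 * fs))
rr = np.diff(peaks) / fs * 1000                     # R-R intervals in ms

# Two standard time-domain HRV indices mentioned in the Methods text
sdnn = rr.std(ddof=1)                               # SD of normal R-R intervals
rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))          # RMS of successive differences
print(f"SDNN = {sdnn:.1f} ms, RMSSD = {rmssd:.1f} ms")
```

The frequency-domain indices (LF/HF power) would then be obtained from a power spectral density of the interpolated R-R interval series, and the non-linear indices from the same cleaned intervals.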

      With regard to the association between the ECG's spectral slopes at high frequencies and frequency-domain indices of heart rate variability: common frequency-domain indices of heart rate variability fall in the range of 0.01-0.4Hz, which probably explains why we didn't notice any association at higher frequency ranges (>10Hz).

      This is also stated in the related part of the results section:

      “In the higher frequency ranges (10 - 145 Hz) spectral slopes were most consistently related to fractal and time domain indices of heart rate variability, but not so much to frequency-domain indices assessing spectral power in frequency ranges < 0.4 Hz.”

      (3) Related to the previous point - what is being reflected in the ECG at higher frequency ranges, with regard to biological mechanisms? Results are being mentioned, but not further discussed. However, this point seems crucial because the age effects across the four datasets differ between low and high-frequency slope limits (Figure 2C).

      This is a great question that definitely requires further attention and investigation in general (see also Tereshchenko & Josephson, 2015). We investigated the change of the slope across frequency ranges that are typically captured in common ECG setups for adults (0.05 - 150Hz; Tereshchenko & Josephson, 2015; Kusayama, Wong, Liu et al., 2020). While most of the physiologically significant spectral information of an ECG recording rests between 1-50Hz (Clifford & Azuaje, 2006), meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz) that falls straight into our spectral analysis window. Further meaningful information can even be extracted above 100Hz, although the exact physiological mechanisms underlying the so-called high-frequency QRS (HF-QRS) remain unclear (see Tereshchenko & Josephson, 2015; Qiu et al., 2024 for reviews discussing possible mechanisms). At the same time, the HF-QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range (Schlegel et al., 2004; Qiu et al., 2024). All optimism aside, it is also worth noting that ECG recordings at higher frequencies can capture skeletal muscle activity with an overlapping frequency range up to 400Hz (Kusayama, Wong, Liu et al., 2020). We now highlight all of this as an outstanding research question when introducing this analysis in the results section, stating that:

      “However, substantially less is known about aperiodic activity above 0.4Hz in the ECG. Yet, common ECG setups for adults capture activity at a broad bandwidth of 0.05 - 150Hz(33,34).

      Importantly, a lot of the physiologically meaningful spectral information rests between 1-50Hz(35), similarly to M/EEG recordings. Furthermore, meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz(35)). However, that’s not all, as further meaningful information can be extracted at even higher frequencies (>100Hz). For instance, the so-called high-frequency QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range(36,37). Yet, the exact physiological mechanisms underlying the high-frequency QRS remain unclear (see (37) for a review discussing possible mechanisms).”

      Tereshchenko, L. G., & Josephson, M. E. (2015). Frequency content and characteristics of ventricular conduction. Journal of electrocardiology, 48(6), 933-937.

      Kusayama, T., Wong, J., Liu, X. et al. Simultaneous noninvasive recording of electrocardiogram and skin sympathetic nerve activity (neuECG). Nat Protoc 15, 1853–1877 (2020). https://doi.org/10.1038/s41596-020-0316-6

      Clifford, G. D., & Azuaje, F. (2006). Advanced methods and tools for ECG data analysis (Vol. 10). P. McSharry (Ed.). Boston: Artech house.

      Qiu, S., Liu, T., Zhan, Z., Li, X., Liu, X., Xin, X., ... & Xiu, J. (2024). Revisiting the diagnostic and prognostic significance of high-frequency QRS analysis in cardiovascular diseases: a comprehensive review. Postgraduate Medical Journal, qgae064.

      Schlegel, T. T., Kulecz, W. B., DePalma, J. L., Feiveson, A. H., Wilson, J. S., Rahman, M. A., & Bungo, M. W. (2004, March). Real-time 12-lead high-frequency QRS electrocardiography for enhanced detection of myocardial ischemia and coronary artery disease. In Mayo Clinic Proceedings (Vol. 79, No. 3, pp. 339-350). Elsevier.

      (4) Page 10: At first glance, it is not quite clear what is meant by "processing option" in the text. Please clarify.

      Thank you for catching this! Upon re-reading, this is indeed a bit ambiguous. We have now replaced “processing options” with “slope fits” to make it clearer that we are talking about the percentage of effects based on the different slope fits.

      (5) The authors mention previous findings on age effects on neural 1/f activity (References Nr 5,8,27,39) that seem contrary to their own findings such as e.g., the mostly steepening of the slopes with age. Also, the authors discuss thoroughly why spectral slopes derived from MEG signals may differ from EEG signals. I encourage the authors to have a closer look at these studies and elaborate a bit more on why these studies differ in their conclusions on the age effects. For example, Tröndle et al. (2022, Ref. 39) investigated neural activity in children and young adults, hence, focused on brain maturation, whereas the CamCAN set only considers the adult lifespan. In a similar vein, others report age effects on 1/f activity in much smaller samples as reported here (e.g., Voytek et al., 2015).

      I believe taking these points into account by briefly discussing them, would strengthen the authors' claims and provide a more fine-grained perspective on aging effects on 1/f.

      The reviewer is making a very important point, as age-related differences in (neuro-)physiological activity are not necessarily strictly comparable or entirely linear across different age cohorts (e.g. age-related changes in alpha center frequency). We therefore added the suggested discussion points to the discussion section.

      “Differences in electric and magnetic field recordings aside, aperiodic activity may not change strictly linearly as we age, and studies looking at younger age groups (e.g. <22 years(44)) may capture different aspects of aging (e.g. brain maturation) than those looking at older subjects (>18 years; our sample). A recent report even shows some first evidence of an interesting, putatively non-linear relationship with age in the sensorimotor cortex for resting recordings(59).”

      (6) The analysis of the working memory paradigm as described in the outlook-section of the discussion comes as a bit of a surprise as it has not been introduced before. If the authors want to convey with this study that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity, I recommend introducing this analysis and the results earlier in the manuscript than only in the discussion to strengthen their message.

      The reviewer is correct; this analysis does come a bit out of the blue. However, that was exactly our intention in placing it in the discussion. As the reviewer correctly noted, the aim was to suggest “that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity”. We placed this outlook directly after the discussion of “(neuro-)physiological origins of aperiodic activity”, where we highlight the potential challenges of interpreting drug-induced changes to M/EEG recordings. The aim was to get the reader to think about whether age is the only feature affected by cardiac activity and then directly present some evidence that the influence might go beyond age.

      However, we have reconsidered this approach based on the reviewer's comments: we moved that paragraph to the end of the results section and now introduce the analysis at the end of the introduction, stating that:

      “But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging.”

      (7) The font in Figure 2 is a bit hard to read (especially in D). I recommend increasing the font sizes where necessary for better readability.

      We agree with the Reviewer and increased the font sizes accordingly.

      (8) Text in the discussion: Figure 3B on page 10 => shouldn't it be Figure 4?

      Thank you for catching this oversight. We have now corrected this mistake.

      (9) In the third section on page 10, the Figure labels seem to be confused. For example, Figure 4 E is supposed to show "steepening effects", which should be Figure 4B I believe.

      Please check the figure labels in this section to avoid confusion.

      Thank you for catching this oversight. We have now corrected this mistake.

      (10) Figure Legend 4 I), please check the figure labels in the text

      Thank you for catching this oversight. We have now corrected this mistake.

      Reviewer #3 (Recommendations for the authors):

      I have a number of suggestions for improving the manuscript, which I have divided by section in the following:

      ABSTRACT:

      I would suggest re-writing the first sentences to make it easier to read for non-expert readers: "The power of electrophysiologically measured cortical activity decays with an approximately 1/fX function. The slope of this decay (i.e. the spectral exponent, X) is modulated..."

      Thank you for the suggestion. We adjusted the sentence as suggested to make it easier for less technical readers to understand that “X” refers to the exponent.

      Including the age range that was studied in the abstract could be informative.

      Done as suggested.

      As an optional recommendation, I think it would increase the impact of the article if the authors note in the abstract that the current most commonly applied cardiac artifact reduction approaches don't resolve the issue for EEG data, likely due to an imperfect ability to separate the cardiac artifact from the neural activity with independent component analysis. This would highlight to the reader that they can't just expect to address these concerns by cleaning their data with typical cleaning methods.

      I think it would also be useful to convey in the abstract just how comprehensive the included analyses were (in terms of artifact reduction methods tested, different aperiodic algorithms and frequency ranges, and both MEG and EEG). Doing so would let the reader know just how robust the conclusions are likely to be.

      This is a brilliant idea! As suggested we added a sentence highlighting that simply performing an ICA may not be sufficient to separate cardiac contributions to M/EEG recordings and refer to the comprehensiveness of the performed analyses.

      INTRODUCTION:

      I would suggest re-writing the following sentence for readability: "In the past, aperiodic neural activity, other than periodic neural activity (local peaks that rise above the "power-law" distribution), was often treated as noise and simply removed from the signal"

      To something like: "In the past, aperiodic neural activity was often treated as noise and simply removed from the signal e.g. via pre-whitening, so that analyses could focus on periodic neural activity (local peaks that rise above the "power-law" distribution, which are typically thought to reflect neural oscillations)."

      We are happy to follow that suggestion.

      Page 3: please provide the number of articles that were included in the examination of the percentage that remove cardiac activity, and note whether the included articles could be considered a comprehensive or nearly comprehensive list, or just a representative sample.

      We stated the exact number of articles in the methods section under Literature Analysis; however, we have added it to the Introduction on page 3 as suggested by the reviewer. The selection of articles was done automatically, based on a list of pre-specified terms, and focussed exclusively on articles that had terms related to aperiodic activity in their title (see Literature Analysis). I would therefore personally be hesitant to call it a comprehensive or nearly comprehensive list of the general M/EEG literature, as the analysis of aperiodic activity is still relatively niche compared to the more commonly investigated evoked potentials or oscillations. Whether or not a reader perceives our analysis as comprehensive should be up to them to decide and is not something I want to impose on them. This is exacerbated by the fact that the analysis of neural aperiodic activity has rapidly gained traction over the last years (see Figure 1D, orange) and the literature analysis was performed almost 2 years ago; it therefore, in my eyes, only represents a glimpse of a rapidly evolving field.

      Figure 1E-F: It's not completely clear that the "Cleaning Methods" part of the figure indicates just methods to clean the cardiac artifact (rather than any artifact). It also seems that ~40% of EEG studies do not apply any cleaning methods even from within the studies that do clean the cardiac artifact (if I've read the details correctly). This seems unlikely. Perhaps there should be a bar for "other methods", or "unspecified"? Having said that, I'm quite familiar with the EEG artifact reduction literature, and I would be very surprised if ~40% of studies cleaned the cardiac artifact using a different method to the methods listed in the bar graph, so I'm wondering if I've misunderstood the figure, or whether the data capture is incomplete / inaccurate (even though the conclusion that ICA is the most common method is almost certainly accurate).

      The cleaning methods shown are indeed focussed on cardiac activity specifically. This was mentioned in the caption of Figure 1 (“We were further interested in determining which artifact rejection approaches were most commonly used to remove cardiac activity, such as independent component analysis (ICA(22)), singular value decomposition (SVD(23)), signal space separation (SSS(24)), signal space projections (SSP(25)) and denoising source separation (DSS(26)).”) and in the methods section under Literature Analysis. However, we adjusted Figure 1EF to make it more obvious that the described cleaning methods relate only to the ECG. Aside from blind source separation techniques such as ICA, a good number of studies mentioned that they cleaned their data based on visual inspection (which was not further considered). Furthermore, it has to be noted that studies were marked as having separated cardiac from neural activity only when this was mentioned explicitly.

      RESULTS:

      Page 6: I would delete the "from a neurophysiological perspective" clause, which makes the sentence more difficult to read and isn't so accurate (frequencies 13-25Hz would probably more commonly be considered mid-range rather than low or high). Additionally, both frequency ranges include 15Hz, but the next sentence states that the ranges were selected to avoid the knee at 15Hz, which seems to be a contradiction. Could the authors explain in more detail how the split addresses the 15Hz knee?

      We removed the “from a neurophysiological perspective” clause as suggested. With regards to the “knee” at ~15Hz, we would refer the reviewer to Supplementary Figure S1. The knee frequency varies substantially across subjects, so splitting the data at a single exact frequency did not seem appropriate. Additionally, we found only spurious significant age-related variations in knee frequency (i.e. in only one of the 4 datasets; not shown).

      Furthermore, we wanted to better connect these findings to our MEG results in Figure 4 and give readers a holistic overview of how different frequency ranges in the aperiodic ECG are affected by age. To fulfill all of these objectives, we decided to fit slopes with upper/lower bounds spanning a range of 5Hz above and below the average 15Hz knee frequency across datasets.

      The later parts of this same paragraph refer to a vast amount of different frequency ranges, but only the "low" and "high" frequency ranges were previously mentioned. Perhaps the explanation could be expanded to note that multiple lower and upper bounds were tested within each of these low and high frequency windows?

      This is a good catch; we adjusted the sentence as suggested. We now write: “... slopes were fitted individually to each subject's power spectrum in several lower (0.25 – 20 Hz) and higher (10 – 145 Hz) frequency ranges.”
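For readers wondering what fitting slopes across several lower and upper bounds amounts to in practice, a minimal sketch on synthetic 1/f-like data follows (plain log-log least squares only, which is the simplest of the aperiodic-fitting approaches; the specific bound pairs below are arbitrary examples, not the exact grids used in the study):

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(3)
fs = 500
# Synthetic 1/f-like signal: cumulative sum of white noise (PSD slope ~ -2)
x = np.cumsum(rng.normal(size=fs * 120))
freqs, psd = signal.welch(x, fs=fs, nperseg=4 * fs)

def fit_slope(fmin, fmax):
    """Least-squares slope of the power spectrum in log-log space."""
    m = (freqs >= fmin) & (freqs <= fmax)
    return np.polyfit(np.log10(freqs[m]), np.log10(psd[m]), 1)[0]

# Several lower/upper bounds within the "low" and "high" windows, analogous
# to fitting many frequency-range specifications per subject
low_ranges = [(0.25, 10), (0.5, 15), (1, 20)]
high_ranges = [(10, 45), (20, 90), (30, 145)]
for fmin, fmax in low_ranges + high_ranges:
    print(f"{fmin:6.2f}-{fmax:5.1f} Hz: slope = {fit_slope(fmin, fmax):.2f}")
```

Repeating such a fit over many bound pairs, subjects, and sensors yields the grid of slope estimates over which the joint inference described earlier is performed.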

      The following two sentences seem to contradict each other: "Overall, spectral slopes in lower frequency ranges were more consistently related to heart rate variability indices(> 39.4% percent of all investigated indices)" and: "In the lower frequency range (0.25 - 20Hz), spectral slopes were consistently related to most measures of heart rate variability; i.e. significant effects were detected in all 4 datasets (see Figure 2D)." (39.4% is not "most").

      The reviewer is correct in stating that 39.4% is not “most”. However, 39.4% is the lowest bound and refers to only 1 dataset; in the other 3 datasets the percentage of effects was above 64%, which can be categorized as “most”, i.e. above 50%. We agree that the sentence was a bit ambiguous, so we added the other percentages as well as a reference to Figure 2D to make this point clearer.

      Figure 2D: it isn't clear what the percentages in the semi-circles reflect, nor why some semi-circles are more full circles while others are only quarter circles.

      The percentages in the semi-circles reflect the proportion of effects (marked in red) and null effects (marked in green) per dataset, averaged across the different measures of HRV. Fewer effects were found for some frequency ranges, resulting in quarter circles instead of semi-circles.

      Page 8: I think the authors could make it more clear that one of the conditions they were testing was the ECG component of the EEG data (extracted by ICA then projected back into the scalp space for the temporal response function analysis).

      As suggested by the reviewer, we adjusted our wording and replaced the arguably ambiguous “... projected back separately” with “... projected back into the sensor space”. We thank the reviewer for this recommendation, as it does indeed make the procedure easier to understand.

      “After pre-processing (see Methods) the data was split in three conditions using an ICA(22). Independent components that were correlated (at r > 0.4; see Methods: MEG/EEG Processing - pre-processing) with the ECG electrode were either not removed from the data (Figure 3ABCD - blue), removed from the data (Figure 2ABCD - orange) or projected back into the sensor space (Figure 3ABCD - green).”

      Figure 4A: standardized beta coefficients for the relationship between age and spectral slope could be noted to provide improved clarity (if I'm correct in assuming that is what they reflect).

      This was indeed shown in Figure 4A and noted in the color bar as “average beta (standardized)”. We do not specifically highlight this in the text because the exact coefficients depend both on the analyzed frequency range and on the selected electrodes.

      Figure 4I: The regressions explained at this point seems to contain a very large number of potential predictors, as I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions? (if that is not the case, it could be explained in greater detail). I'm also not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including them back into the same regression model. It seems that there could be some circularity or redundancy in doing so. However, I'm not confident that this is an issue, so would appreciate the authors explaining why it this is a valid approach (if that is the case).

      After observing significant effects in similar frequency bands in both the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions, we wanted to understand whether these age-related changes are statistically independent. To test this, we added both variables as predictors in a regression model (thereby accounting for the influence of each in relation to age). The regression models we performed were therefore not very complex: they were built using only two predictors, namely the data (in a specific frequency range) averaged over the channels on which we noticed significant effects in the ECG rejected and ECG components data, respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section, stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”

      We hope this explanation solves the previous misunderstanding.
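The two-predictor model described above (age ~ 1 + ECG rejected + ECG components) can be sketched with plain least squares; all simulated values and variable names here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of subjects

# Hypothetical subject-level predictors: spectral slopes averaged over the
# channels showing significant effects in each condition.
slope_ecg_rejected = rng.normal(size=n)
slope_ecg_component = rng.normal(size=n)

# Simulated age with independent contributions from both predictors.
age = (40 + 3.0 * slope_ecg_rejected + 5.0 * slope_ecg_component
       + rng.normal(scale=0.1, size=n))

# Design matrix with intercept: age ~ 1 + ECG rejected + ECG components.
X = np.column_stack([np.ones(n), slope_ecg_rejected, slope_ecg_component])
beta, *_ = np.linalg.lstsq(X, age, rcond=None)
# beta[1] and beta[2] estimate the unique contribution of each condition,
# each statistically controlling for the other.
```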

      The explanation of results for relationships between spectral slopes and aging reported in Figure 4 refers to clusters of effects, but the statistical inference methods section doesn't explain how these clusters were determined.

      We used the word “cluster” to describe a “category” of effects, e.g. null effects. We changed the wording from “cluster” to “category” to make this clearer, now stating that: “This analysis, which is depicted in Figure 4, shows that over a broad amount of individual fitting ranges and sensors, aging resulted in a steepening of spectral slopes across conditions (see Figure 4E) with “steepening effects” observed in 25% of the processing options in MEG<sub>ECG not rejected</sub>, 0.5% in MEG<sub>ECG rejected</sub>, and 60% for MEG<sub>ECG components</sub>. The second largest category of effects were “null effects” in 13% of the options for MEG<sub>ECG not rejected</sub>, 30% in MEG<sub>ECG rejected</sub>, and 7% for MEG<sub>ECG components</sub>.”

      Page 12: can the authors clarify whether these age related steepenings of the spectral slope in the MEG are when the data include the ECG contribution, or when the data exclude the ECG? (clarifying this seems critical to the message the authors are presenting).

      We apologize for not making this clearer. We now write: “This analysis also indicates that a vast majority of observed effects irrespective of condition (ECG components, ECG not rejected, ECG rejected) show a steepening of the spectral slope with age across sensors and frequency ranges.”

      Page 13: I think it would be useful to describe how much variance was explained by the MEG-ECG rejected vs MEG-ECG component conditions for a range of these analyses, so the reader also has an understanding of how much aperiodic neural activity might be influenced by age (vs if the effects are really driven mostly by changes in the ECG).

      With regard to the explained variance, I think that the very important question of how strongly age influences changes in aperiodic activity is a topic better suited for a meta-analysis, as the effect sizes seem to vary largely depending on the sample; e.g. for EEG, results in the literature were reported at r = -0.08 (Cesnaite et al. 2023), r = -0.26 (Cellier et al. 2021), r = -0.24/r = -0.28/r = -0.35 (Hill et al. 2022) and r = 0.5/r = 0.7 (Voytek et al. 2015). I would refer the reader/reviewer to the standardized beta coefficients depicted in Figure 4A as a measure of effect size in the current study.

      Cellier, D., Riddle, J., Petersen, I., & Hwang, K. (2021). The development of theta and alpha neural oscillations from ages 3 to 24 years. Developmental cognitive neuroscience, 50, 100969.

      Cesnaite, E., Steinfath, P., Idaji, M. J., Stephani, T., Kumral, D., Haufe, S., ... & Nikulin, V. V. (2023). Alterations in rhythmic and non‐rhythmic resting‐state EEG activity and their link to cognition in older age. NeuroImage, 268, 119810.

      Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076.

      Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-related changes in 1/f neural electrophysiological noise. Journal of Neuroscience, 35(38), 13257-13265.

      Also, if there are specific M/EEG sensors where the 1/f activity does relate strongly to age, it would be worth noting these, so future research could explore those sensors in more detail.

      I think it is difficult to make a clear claim about this for MEG data, as the exact location or type of the sensor may differ across manufacturers. Such a statement could be made more easily for source-projected data, or in cases where EEG electrodes are available, where the locations are standardized, e.g. according to the 10-20 system.

      DISCUSSION:

      Page 15: Please change the wording of the following sentence, as the way it is currently worded seems to suggest that the authors of the current manuscript have demonstrated this point (which I think is not the case): "The authors demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods."

      Apologies for the oversight! The reviewer is correct: we did not show this ourselves; the authors of the cited manuscript did. We corrected the sentence as suggested, which now states:

      “Bénar et al. demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods.”

      Page 16: The authors mention the results can be sensitive to the application of SSS to clean the MEG data, but not ICA. I think it would be sensitive to the application of either SSS or ICA?

      This is correct and is actually also supported by Figure S7, as differences in ICA thresholds also affect the detection of age-related effects. We therefore adjusted the related sentences, which now state:

      “ In case of the MEG signal this may include the application of Signal-Space-Separation algorithms (SSS(24,55)), different thresholds for ICA component detection (see Figure S7), high and low pass filtering, choices during spectral density estimation (window length/type etc.), different parametrization algorithms (e.g. IRASA vs FOOOF) and selection of frequency ranges for the aperiodic slope estimation.”

      It would be worth clarifying that the linked mastoid re-reference alone has been proposed to cancel out the ECG signal, rather than that a linked-mastoid re-reference improves the performance of the ICA separation (which could be inferred by the explanation as it's currently written).

      This is correct, and we adjusted the sentence accordingly. It now states:

      “ Previous work(12,56) has shown that a linked mastoid reference alone was particularly effective in reducing the impact of ECG related activity on aperiodic activity measured using EEG. “

      The issue of the number of EEG channels could probably just be noted as a potential limitation, as could the issue of neural activity being mixed into the ECG component (although this does pose a potential confound to the M/EEG without ECG condition, I suspect it wouldn't be critical).

      This is indeed a very fair point, as a higher number of electrodes would probably make it easier to isolate ECG components in the EEG, which may be the reason why the separation did not work as well in our case. However, this is ultimately an empirical question, so we highlighted it in the discussion section, stating that: “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources.”

      OUTLOOK:

      Page 19: Although there has been a recent trend to control for 1/f activity when examining oscillatory power, recent research suggests that this should only be implemented in specific circumstances, otherwise the correction causes more of a confound than the issue does. It might be worth considering this point with regards to the final recommendation in the Outlook section: Brake, N., Duc, F., Rokos, A., Arseneau, F., Shahiri, S., Khadra, A., & Plourde, G. (2024). A neurophysiological basis for aperiodic EEG and the background spectral trend. Nature Communications, 15(1), 1514.

      We want to thank the reviewer for recommending this very interesting paper! The authors of said paper present compelling evidence showing that, while peak detection above an aperiodic trend using methods like FOOOF or IRASA is a prerequisite to determine the presence of oscillatory activity, it’s not necessarily straightforward to determine which detrending approach should be applied to determine the actual power of an oscillation. Furthermore, the authors suggest that wrongfully detrending may cause larger errors than not detrending at all. We therefore added a sentence stating that: “However, whether or not periodic activity (after detection) should be detrended using approaches like FOOOF or IRASA still remains disputed, as incorrectly detrending the data may cause larger errors than not detrending at all(75).”

      RECOMMENDATIONS:

      Page 20: "measure and account for" seems like it's missing a word, can this be re-written so the meaning is more clear?

      Done as suggested. The sentence now states: “To better disentangle physiological and neural sources of aperiodic activity, we propose the following steps to (1) measure and (2) account for physiological influences.”

      I would re-phrase "doing an ICA" to "reducing cardiac artifacts using ICA" (this wording could be changed in other places also).

      I do not like to describe cardiac or ocular activity as artifactual per se. This is also why I used quotation marks whenever I mention the word “artifact” in association with the ECG or EOG. However, I do understand that the wording of “doing an ICA” is a bit sloppy. We therefore reworded it accordingly throughout the manuscript, e.g. to “separating cardiac from neural sources using an ICA” and “separating physiological from neural sources using an ICA”.

      I would additionally note that even if components are identified as unambiguously cardiac, it is still likely that neural activity is mixed in, and so either subtracting or leaving the component will both be an issue (https://doi.org/10.1101/2024.06.06.597688). As such, even perfect identification of whether components are cardiac or not would still mean the issue remains (and this issue is also consistent across a considerable range of component based methods). Furthermore, current methods including wavelet transforms on the ICA component still do not provide good separation of the artifact and neural activity.

      This is definitely a fair point and we also highlight this in our recommendations under 3 stating that:

      “However, separating physiological from neural sources using an ICA is no guarantee that peripheral physiological activity is fully removed from the cortical signal. Even more sophisticated ICA based methods that e.g. apply wavelet transforms on the ICA components may still not provide a good separation of peripheral physiological and neural activity76,77. This turns the process of deciding whether or not an ICA component is e.g. either reflective of cardiac or neural activity into a challenging problem. For instance, when we only extract cardiac components using relatively high detection thresholds (e.g. r > 0.8), we might end up misclassifying residual cardiac activity as neural. In turn, we can’t always be sure that using lower thresholds won’t result in misinterpreting parts of the neural effects as cardiac. Both ways of analyzing the data can potentially result in misconceptions.”

      Castellanos, N. P., & Makarov, V. A. (2006). Recovering EEG brain signals: Artifact suppression with wavelet enhanced independent component analysis. Journal of neuroscience methods, 158(2), 300-312.

      Bailey, N. W., Hill, A. T., Godfrey, K., Perera, M. P. N., Rogasch, N. C., Fitzgibbon, B. M., & Fitzgerald, P. B. (2024). EEG is better when cleaning effectively targets artifacts. bioRxiv, 2024-06.

      METHODS:

      Pre-processing, page 24: I assume the symmetric setting of fastica was used (rather than the deflation setting), but this should be specified.

      The reviewer is indeed correct: we used the standard setting of fastICA as implemented in MNE-Python, which calls the FastICA implementation in sklearn and by default uses the “parallel” (symmetric) algorithm to compute the ICA. We added this information to the text accordingly, stating that:

      “For extracting physiological “artifacts” from the data, 50 independent components were calculated using the fastica algorithm(22) (implemented in MNE-Python version 1.2; with the parallel/symmetric setting; note: 50 components were selected for MEG for computational reasons; for the analysis of EEG data no threshold was applied).”
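Since MNE-Python's fastica delegates to scikit-learn, the parallel/symmetric setting can be shown directly with sklearn. A hedged toy sketch with simulated sources (all signals below are illustrative; the real analysis used 50 components and real M/EEG data):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(1)
t = np.linspace(0, 8, 2000)

# Two hypothetical sources: a spiky "cardiac" train and a 10 Hz oscillation.
cardiac = (np.sin(2 * np.pi * 1.2 * t) > 0.95).astype(float)
neural = np.sin(2 * np.pi * 10 * t)
sources = np.column_stack([cardiac, neural])

# Mix into five pseudo-sensors and add a little noise.
mixing = rng.normal(size=(2, 5))
sensors = sources @ mixing + 0.01 * rng.normal(size=(2000, 5))

# algorithm="parallel" is the symmetric variant described in the reply.
ica = FastICA(n_components=2, algorithm="parallel", random_state=0)
components = ica.fit_transform(sensors)  # shape: (n_samples, n_components)

# Flag components correlated with the ECG reference at |r| > 0.4,
# mirroring the detection threshold used in the manuscript.
r = [abs(np.corrcoef(components[:, i], cardiac)[0, 1]) for i in range(2)]
cardiac_components = [i for i, ri in enumerate(r) if ri > 0.4]
```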

      Temporal response functions, page 26: can the authors please clarify whether the TRF is computed against the ECG signal for each electrode or sensory independently, or if all electrodes/sensors are included in the analysis concurrently? I'm assuming it was computed for each electrode and sensory separately, since the TRF was computed in both the forward and backwards direction (perhaps the meaning of forwards and backwards could be explained in more detail also - i.e. using the ECG to predict the EEG signal, or using the EEG signal to predict the ECG signal?).

      A TRF can also be conceptualized as a multiple regression model over time lags. This means that we used all channels to compute the forward and backward models. In the case of the forward model, we predicted the signal of the M/EEG channels in a multivariate regression model using the ECG electrode as the predictor. In the case of the backward model, we predicted the ECG electrode signal based on the signals of all M/EEG channels. The forward model was used to depict the time window at which the ECG signal was encoded in the M/EEG recording, which appears at a time lag of 0, indicating volume conduction. The backward model was used to see how much ECG information was decodable when taking the information of all channels into account.

      We tried to further clarify this approach in the methods section stating that:

      “We calculated the same model in the forward direction (encoding model; i.e. predicting M/EEG data in a multivariate model from the ECG signal) and backward direction (decoding model; i.e. predicting the ECG signal using all M/EEG channels as predictors).”
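The forward/encoding direction can be conceptualized as a regression on time-lagged copies of the ECG. A hedged, numpy-only sketch (not the actual TRF implementation used in the manuscript; all signals and lags are simulated), where the peak weight at lag 0 mirrors the volume-conduction argument above:

```python
import numpy as np

def lagged_design(x, lags):
    """Stack time-shifted copies of x, one column per lag."""
    n = len(x)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = x[:n - lag]
        else:
            X[:lag, j] = x[-lag:]
    return X

rng = np.random.default_rng(2)
n = 5000
ecg = rng.normal(size=n)

# Hypothetical sensor containing the ECG at zero lag (volume conduction)
# plus unrelated noise.
sensor = 0.8 * ecg + 0.5 * rng.normal(size=n)

# Forward (encoding) model: predict the sensor from lagged ECG copies.
lags = list(range(-5, 6))
w, *_ = np.linalg.lstsq(lagged_design(ecg, lags), sensor, rcond=None)
peak_lag = lags[int(np.argmax(np.abs(w)))]  # expected at lag 0
```

The backward (decoding) direction would instead build the design matrix from all sensor channels and predict the ECG.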

      Page 27: the ECG data was fit using a knee, but it seems the EEG and MEG data was not.

      Does this difference pose any potential confound to the conclusions drawn? (having said this, Figure S4 suggests perhaps a knee was tested in the M/EEG data, which should perhaps be explained in the text also).

      This was indeed tested in a previous review round to ensure that our results are not dependent on the presence/absence of a knee in the data. We therefore added Figure S4 but forgot to actually add a description in the text. We are sorry for this oversight and have added a paragraph to S1 accordingly:

      “Using FOOOF(5), we also investigated the impact of different slope fitting options (fixed vs. knee model fits) on the aperiodic age relationship (see Supplementary Figure S4). The results that we obtained from these analyses using FOOOF offer converging evidence with our main analysis using IRASA.”

      Page 32: my understanding of the result reported here is that cleaning with ICA provided better sensitivity to the effects of age on 1/f activity than cleaning with SSS. Is this accurate? I think this could also be reported in the main manuscript, as it will be useful to researchers considering how to clean their M/EEG data prior to analyzing 1/f activity.

      The reviewer is correct in stating that we overall detected slightly more “significant” effects when not additionally cleaning the data using SSS. However, I am a bit wary of recommending omitting the use of the SSS maxfilter solely based on this information. It can very well be that the higher quantity of effects (when not employing the SSS maxfilter) stems from other physiological sources (e.g. muscle activity) that are correlated with age and removed when applying SSS maxfiltering. I think that conditioning the decision of whether or not maxfilter is applied solely on the number or size of effects may not be the best idea. Instead, I think that the applicability of maxfilter for research questions related to aperiodic activity should be the topic of additional methodological research. We therefore now write in Text S1:

      “Considering that we detected fewer and weaker aperiodic effects when using the SSS maxfilter, is it advisable to omit maxfiltering when analyzing aperiodic signals? We don’t think that we can make such a judgment based on our current results. This is because it's unclear whether the reduction of effects stems from an additional removal of peripheral information (e.g. muscle activity that may be correlated with aging) or is induced by the SSS maxfiltering procedure itself. As the use of maxfilter in detecting changes of aperiodic activity has not, to our knowledge, been the subject of analysis, we suggest that this should be the topic of additional methodological research.”

      Page 39, Figure S6 and Figure S8: Perhaps the caption could also briefly explain the difference between maxfilter set to false vs true? I might have missed it, but I didn't gain an understanding of what varying maxfilter would mean.

      Figure S6 shows the effect of ageing on the spectral slope averaged across all channels. “Maxfilter set to false” in AB) means that no maxfiltering using SSS was performed, whereas in CD) the data were additionally processed using the SSS maxfilter algorithm. We now describe this more clearly by writing in the caption:

      “Supplementary Figure S6: Age-related changes in aperiodic brain activity are mostly explained by cardiac components, irrespective of whether the data were maxfiltered using signal space separation (SSS) or not. AC) Age was used to predict the spectral slope (fitted at 0.1-145Hz) averaged across sensors at rest in three different conditions (ECG components not rejected [blue], ECG components rejected [orange], ECG components only [green]).”

    1. Welcome back to this lesson where I want to talk briefly about VPC Flow Logs, which are a useful networking feature of AWS VPCs, providing details of traffic flow within the private network. The most important thing to know about VPC Flow Logs is that they only capture packet metadata; they don't capture packet contents. If you need to capture the contents of packets, then you need a packet sniffer, something which you might install on an EC2 instance. So just to be really clear on this point, VPC Flow Logs only capture metadata, which means things like the source IP, the destination IP, the source and destination ports, packet size, and so on — anything which conceptually you could observe from outside, anything to do with the flow of data through the VPC.

      Now Flow Logs work by attaching virtual monitors within a VPC and these can be applied at three different levels. We can apply them at the VPC level, which monitors every network interface in every subnet within that VPC; at the subnet level, which monitors every interface within that specific subnet; and directly to interfaces, where they only monitor that one specific network interface.

      Now Flow Logs aren't real time — there's a delay between traffic entering or leaving monitored interfaces and showing up within VPC Flow Logs. This often comes up as an exam question, so this is something that you need to be aware of: you can't rely on Flow Logs to provide real-time telemetry on network packet flow, as there's a delay between that traffic flow occurring and that data showing up within the Flow Logs product.

      Now Flow Logs can be configured to go to multiple destinations; currently these are S3 and CloudWatch Logs. It's a preference thing, and each of these comes with its own trade-offs. If you use S3, you're able to access the log files directly and can integrate them with either a third-party monitoring solution or something that you design yourself. If you use CloudWatch Logs, then obviously you can integrate that with other products, stream that data into different locations, and access it either programmatically or using the CloudWatch Logs console. So that's an important distinction you need to understand for the exam.

      You can also use Athena if you want to query Flow Logs stored in S3 using a SQL-like querying method. This is important if you have an existing data team and a more formal, rigorous review process of your Flow Logs. You can use Athena to query those logs in S3 and only pay for the amount of data read. Athena, remember, is an ad hoc querying engine which uses a schema-on-read architecture, so you're only billed for the data as it's read through the product and the data that's stored on S3 — that's critical to understand.

      Now visually, this is how the Flow Logs product is architected. We start with a VPC with two subnets — a public one on the right in green and a private one on the left in blue. This architecture is running the Categorum application and this specific implementation has an application server in the public subnet, which is accessed by our user Bob. The application uses a database within the private subnet, which has a primary instance as well as a replicated standby instance.

      Flow Logs can be captured, as I just mentioned, at a few different points — at the VPC level, at the subnet level, and directly on specific elastic network interfaces — and it's important to understand that Flow Logs capture from that point downwards. So any Flow Logs enabled at the VPC level will capture traffic metadata from every network interface in every subnet in that VPC; anything enabled at the subnet level is going to capture metadata for any network interfaces in that specific subnet, and so on.

      Flow Logs can be configured to capture metadata on only accepted connections, only on rejected connections, or on all connections. Visually, this is an example of a Flow Log configuration at the network interface level — it captures metadata from the single elastic network interface of the application instance within the public subnet. If we created something at the subnet level, for example the private subnet, then metadata from both of the database instances is captured as part of that configuration. Anything captured can be sent to a destination, and the current options are S3 and CloudWatch Logs.

      Now I'm going to be discussing this in detail in a moment, but the Flow Logs product captures what are known as Flow Log Records, and architecturally these look something like this. I'm going to be covering this next in detail — I'm going to step through all of the different fields just to give you a level of familiarity before you get the experience practically in a demo lesson. A VPC Flow Log is a collection of rows and each row has the following fields. All of the fields are important in different situations, but I've highlighted the ones that I find are used most often — source and destination IP address, source and destination port, the protocol, and the action.

      Consider this example: Bob is running a ping against an application instance inside AWS. Bob sends a ping packet to the instance and it responds — this is a common way to confirm connectivity and to assess the latency, so this is a good indication of the performance between two different internet-connected services. The Flow Log for this particular interaction might look something like this — I've highlighted Bob's IP address in pink and the server's private IP address in blue. This shows outward traffic from Bob to the EC2 instance — remember the order: source and destination, and that’s for both the IP addresses and the port numbers. Normally you would have a source and destination port number directly after that, but this is ping, so ICMP, which doesn't use ports, so that’s empty.

      The next highlighted field is the protocol number; ICMP is 1, TCP is 6, and UDP is 17. Now you don't really need to know this in detail for the exam, but it will definitely help you if you use VPC Flow Logs day to day, and it might feature as a point of elimination in an exam question, so do your best to remember the numbers for ICMP, TCP, and UDP.
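As a hedged illustration, a record in the default (version 2) flow log format can be split into named fields and its protocol number mapped to a name. The field order follows AWS's default record format; the example values and the snake_case dictionary keys are invented (note that in the raw format, ICMP records carry 0 in the two port fields rather than leaving them blank):

```python
# Default version-2 flow log record field order.
FIELDS = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
          "srcport", "dstport", "protocol", "packets", "bytes",
          "start", "end", "action", "log_status"]

# IANA protocol numbers mentioned in the lesson.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def parse_record(line):
    """Split a space-separated flow log record into named fields and
    resolve the numeric protocol to a human-readable name."""
    record = dict(zip(FIELDS, line.split()))
    record["protocol_name"] = PROTOCOLS.get(int(record["protocol"]), "other")
    return record

# A made-up ICMP (ping) record that was accepted.
record = parse_record(
    "2 123456789012 eni-0123456789abcdef0 203.0.113.12 10.0.0.5 "
    "0 0 1 10 840 1600000000 1600000060 ACCEPT OK"
)
```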

      The second to last item indicates if the traffic was accepted or rejected — this indicates if it was blocked or not by a security group or a network access control list. If it's a security group, then generally only one line will show in the Flow Logs — remember security groups are stateful, so if the request is allowed, then the response is automatically allowed in return. What you might see is something like this, where you have one Flow Log record which accepts traffic and then another which rejects the response to that conversation.

      If you have an EC2 instance inside a subnet where the instance has a security group allowing pings from an external IP address, then the response will be automatically allowed. But if you have a network ACL on that instance's subnet which allows the ping inbound but doesn't allow it outbound, then it can cause a second line — a reject. It's important that you look out for both of these types of things in the exam, so if you see an accept and then a reject, and these look to be for the same flow of traffic, then you're going to be able to tell that both a security group and a network ACL are used and they're potentially restricting the flow of traffic between the source and the destination.

      Flow Logs show the results of traffic flows as they're evaluated — security groups are stateful and so they only evaluate the conversation itself, which includes the request and the response, while network ACLs are stateless and consider traffic flows as two separate parts, request and response, both of which are evaluated separately, so you might see two log entries within VPC Flow Logs.

      Now one thing before I finish up with this lesson: VPC Flow Logs don't log all types of traffic — there are some things which are excluded. This includes things such as the metadata service (so any accesses to the metadata service running inside the EC2 instance), time server requests, any DHCP requests which are running inside the VPC, and any communications with the Amazon Windows license server — obviously this applies only for Windows EC2 instances — so you need to be aware that certain types of traffic are not actually recorded using Flow Logs.

      Now we are going to have some demos elsewhere in the course where you are going to get some practical experience of working with Flow Logs, but this is all of the theory which I wanted to introduce within this lesson. At this point go ahead and complete this video, and when you're ready, I'll look forward to you joining me in the next.

    1. Generate one English language annotation for a provided text passage, using a simple annotation mode.

      On a Friday afternoon, the school board and the teachers’ union reached an agreement, and students were told to return to school on Monday. [Resolution of conflict: End of strike/dispute] Two weeks of uncertainty, duality, and opposition had come to an end. [Duration of conflict] Some positions and relations had become hardened in the meantime—between the school board and the schools, the teachers and families, and the community among one another. [Consequences of conflict: Strained relationships] Remarks and political theater about who “won” and “lost,” who was “right” and “wrong,” were already circulating in the news, on social media, and among neighbors. [Post-conflict rhetoric and division]

      Binaries again. [Observation: Reinforcement of opposing viewpoints]

      But also, some unstuckness. [Observation: Potential for positive change]

      A few days before the agreement was announced, the parent-teacher organizations of two schools extended an open invitation to community members to be in relation together in a public space. Approximately fifty families attended. [Positive community initiative: Dialogue and collaboration] This “invitation for dialogue” between families, parents who were teachers, parents who were school board members, and city councilors—some of whom were also parents in the school system—came to matter as it created the conditions for unpredictable newness and difference to emerge in a way that finally began to usher in hope. [Impact of initiative: Building bridges and fostering hope]

      A children’s book can also be that invitational vibrant matter that brings people together and creates the conditions to be outside of some of the expired, stuck stories. [Metaphor: Children's books as agents of positive change] Children’s books can indeed be mirrors, windows, and sliding glass doors for readers, and they are also powerful agents in themselves, affecting space, a moment in time, and a community in visceral ways to become something different and new. [Concluding statement: The transformative power of children's literature]

    1. Welcome back and this lesson will be one of a number of lessons in this section of the course where I'll be covering identity federation within AWS. Now identity federation is the process of using an identity from another identity provider to access AWS resources.

      In AWS though, this can't be direct. Only AWS credentials can be used to access AWS resources. And so some form of exchange is required. And that's what I'll be covering in this lesson. So let's jump in and get started.

      Now before we talk about the architecture of SAML 2.0 Identity Federation, let's talk about SAML itself. So SAML stands for Security Assertion Markup Language. And SAML 2.0 is version two of this standard.

Now it's an open standard which is used by many identity providers such as Microsoft with their Active Directory Federation Services known as ADFS, but many other on-premises identity providers utilize the SAML standard. And SAML 2.0 based federation allows you to indirectly use on-premises identities to access the AWS console and AWS command line interface.

      Now I want to stress the important point here and that's the word indirectly. You can't access AWS resources using anything but AWS credentials. And so the process of federation within AWS involves exchanging or swapping these external identities for valid AWS credentials.

      For the exam it's important that you both know how the architecture works as well as when you should select it versus the other identity federation options available within AWS.

      Now SAML 2.0 based identity federation is used when you currently use an enterprise-based identity provider which is also SAML 2.0 compatible. Both of these need to be true. You wouldn't use it with a Google identity provider and you wouldn't use it with one which isn't SAML 2.0 compatible. So focus on those selection criteria for the exam.

      Secondly, if you have an existing identity management team and you want to maintain that function allowing them to manage access to AWS as well, then SAML 2.0 based federation is ideal. Or if you're looking to maintain a single source of identity truth within your business and/or you have more than 5,000 users, then you should also look at using SAML 2.0 based federation.

      Now if you read a question in the exam which mentions Google, Facebook, Twitter, Web or anything which suggests that SAML 2.0 is not supported, then this is not the right type of identity federation to use. So keep that in mind when you're reviewing exam questions. If it mentions any of those terms, you should probably assume that SAML 2.0 identity federation is not the right thing to select.

      Now federation within AWS uses IAM roles and temporary credentials and in the case of SAML 2.0 based identity federation, these temporary credentials generally have up to a 12-hour validity. So keep that in mind for the exam.

      Now at this point I want to step through the architectural flow of exactly how SAML 2.0 based identity federation works, both when you're accessing using the API or CLI as well as the console UI. So let's do that next and we'll start with API or CLI access.

      So we start with an on-premises environment on the left using a SAML 2.0 compatible identity provider and identity store. And on the right is an AWS environment configured to allow SAML 2.0 based identity federation.

      What this actually means from an infrastructure perspective is that an identity provider exists on the left side and a SAML identity provider has been created within IAM on the right and a two-way trust has been established between the two. So the instance of IAM on the right has been configured to trust the identity provider on-premises within the environment on the left.

      Then we have an application, in this case the enterprise version of the Categorum application. And this is something that's developed internally within the Animals for Life organization and it initiates this process by communicating with the identity provider to request access.

      Once this process is initiated, the identity provider accesses the identity store and authenticates the request, pulling a list of roles which the identity used by the Categorum application has access to. Inside the identity provider you've mapped identities onto one or more roles and one identity might be able to use multiple different roles. And so the Categorum application will have a selection from which to pick.

      Now once this authentication process completes and the Categorum application has selected a role, what it gets back is what's known as a SAML assertion. And think of this as a token proving that the identity has authenticated. This SAML assertion is trusted by AWS.

      Now the key concept to realize so far is that the application has initiated this process by communicating with the identity provider. This is an application-initiated process.

      The next step is that the application communicates with AWS, specifically STS using the STS Assume Role with SAML operation. And it passes in the SAML assertion with that request. Now STS accepts the call and allows the role assumption. And this generates temporary AWS credentials which are returned to the application.

      And this is another critical stage because what's happened here is the SAML assertion has essentially been exchanged for valid temporary AWS credentials. And remember only AWS credentials can be used to access AWS resources. And so the Categorum application can now use these temporary credentials to interact with AWS services such as DynamoDB.

So at a high level the process does require some upfront configuration. There needs to be a bidirectional trust created between IAM and the identity provider that's being used for this architecture. And once this trust has been established, AWS will respect these SAML assertions and use them as proof of authentication, allowing whatever identity is performing this process to gain access to temporary credentials which can be used to interact with AWS.

      Now this process occurs behind the scenes. This is the architecture that's used if you're developing these applications in a bespoke way inside your business. So if you're a developer and you're looking to utilize identity federation to access AWS, this is the type of architecture that you'll use.
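As a rough sketch of this developer-side flow, an application might parse the role/provider ARN pairs carried in the assertion's Role attribute and then exchange the assertion via STS's AssumeRoleWithSAML operation. The helper names here are hypothetical, and the STS call is defined but not executed, since it needs network access, valid IAM configuration, and a real base64-encoded assertion:

```python
def parse_saml_roles(role_attribute_values):
    """Each value of the SAML 'Role' attribute pairs an IAM role ARN with
    the SAML provider (principal) ARN, comma separated, in either order.
    Returns a list of (role_arn, principal_arn) tuples to pick from."""
    pairs = []
    for value in role_attribute_values:
        # Put the ARN containing ":role/" first, the provider ARN second.
        role_arn, principal_arn = sorted(
            value.split(","), key=lambda arn: ":role/" not in arn)
        pairs.append((role_arn, principal_arn))
    return pairs


def exchange_assertion_for_credentials(role_arn, principal_arn, assertion_b64):
    """Swap a SAML assertion for temporary AWS credentials via STS.
    Not executed in this sketch; requires boto3, network access and a
    genuine assertion from the identity provider."""
    import boto3  # assumed available in the application environment
    sts = boto3.client("sts")
    response = sts.assume_role_with_saml(
        RoleArn=role_arn,
        PrincipalArn=principal_arn,
        SAMLAssertion=assertion_b64,
        DurationSeconds=3600,
    )
    # AccessKeyId, SecretAccessKey, SessionToken and Expiration
    return response["Credentials"]
```

The application would then feed the returned temporary credentials into its AWS SDK client to call services such as DynamoDB.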

      Now you can also use SAML 2.0 based identity federation to grant access to the AWS console for internal enterprise users. And on the whole it uses a very similar architectural flow. So let's have a look at that next.

      When we're using SAML based identity federation to provide console access, we still have the on-premises environment on the left and AWS on the right. We still have the same identity provider, for example ADFS, but this time it's a user who wants to access the AWS console rather than an application interacting with AWS products and services.

      There still needs to be a trust configured, this time between the identity provider and an SSO endpoint, also known as the AWS SAML endpoint. And this is configured within IAM inside the AWS account.

Now to begin this process, our user Bob browses to our identity provider portal. And this might look something like this URL. So this URL is for the Animals for Life ADFS server. Bob browses to this URL and he sees a portal which he needs to interact with.

      Now when Bob loads this portal, before he can interact with it in any way, behind the scenes, the identity provider authenticates the request. Now this might mean explicitly logging in to the identity provider portal, or it might use the fact that you're already logged in to an active directory domain on your local laptop or workstation and use this authentication instead of asking you to log in again.

But in either case, you'll be presented with a list of roles that you can use based on your identity. So there might be an admin role, a normal role, or even an auditing role, and all of these provide different permissions within AWS.

So once you've been authenticated and selected a role, the identity provider portal returns a SAML assertion and instructions to point at a SAML endpoint that operates inside AWS.

      Now once the client receives this information, it sends the SAML assertion to the SAML endpoint that you've configured in advance within AWS. And this assertion proves that you've authenticated as your identity and it provides details on the access rights that you should receive.

Now in the background, the IAM role that you selected in the identity provider portal is assumed using STS, and the endpoint receives temporary security credentials on your behalf.

      Then at this point it creates a sign-in URL for the AWS console which includes those credentials and it delivers those back to the client which the client then uses to access the AWS console UI.

      So at a high level this is a fairly similar process. You're being authenticated by an identity provider that exists on premises. You're getting a SAML assertion. This is delivered to AWS. It's exchanged for temporary security credentials which are valid to interact with AWS resources.

      The only difference in this case is that the SAML endpoint is constructing a URL which can be used to access the AWS console UI which includes those credentials. And then once that URL has been created it's passed back to your client and your client uses it to access the AWS management console. And this all happens behind the scenes without you having any visibility of it.
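The sign-in URL construction described above can be sketched against AWS's documented federation endpoint. This is a minimal illustration of the two URLs involved (the Issuer and Destination values are placeholder assumptions, and the actual HTTP request to fetch the SigninToken is not performed here):

```python
import json
from urllib.parse import urlencode

FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"


def build_get_signin_token_url(access_key, secret_key, session_token):
    """URL that asks the federation endpoint to exchange temporary
    credentials for a sign-in token (Action=getSigninToken)."""
    session = json.dumps({
        "sessionId": access_key,
        "sessionKey": secret_key,
        "sessionToken": session_token,
    })
    return FEDERATION_ENDPOINT + "?" + urlencode(
        {"Action": "getSigninToken", "Session": session})


def build_console_login_url(signin_token,
                            destination="https://console.aws.amazon.com/",
                            issuer="https://idp.example.com/"):
    """Final sign-in URL handed back to the client's browser
    (Action=login), embedding the SigninToken."""
    return FEDERATION_ENDPOINT + "?" + urlencode(
        {"Action": "login", "Issuer": issuer,
         "Destination": destination, "SigninToken": signin_token})
```

In the real flow, the endpoint requests the token from the first URL, then returns the second URL to the browser, which uses it to open the AWS management console.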

      The key concept to understand and the really important thing about AWS Identity Federation is that you cannot use external identities directly to access AWS resources. They have to be exchanged for valid AWS credentials. And this is how that process happens when you're using a SAML 2.0 compatible identity provider.

      Now that's all of the architectural theory that I wanted to cover in this lesson. In other lessons in this section of the course you're going to get some additional exposure to other types of identity federation within AWS. But SAML 2.0 based identity federation is the one that tends to be used in larger enterprises, especially those with a Windows based identity provider.

So that's everything I wanted to cover. So go ahead and complete this video and when you're ready I'll look forward to you joining me in the next.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

This manuscript by Kaya et al. studies the effect of food consumption on hippocampal sharp wave ripples (SWRs) in mice. The authors use multiple foods and forms of food delivery to show that the frequency and power of SWRs increase following food intake, and that this effect depends on the caloric content of food. The authors also studied the effects of the administration of various food-intake-related hormones on SWRs during sleep, demonstrating that ghrelin negatively affects SWR rate and power, but not GLP1, insulin, or leptin. Finally, the authors use fiber photometry to show that GABAergic neurons in the lateral hypothalamus increase activity during SWR events.

      Strengths:

      The experiments in this study seem to be well performed, and the data are well presented, visually. The data support the main conclusions of the manuscript that food intake enhances hippocampal SWRs. Taken together, this study is likely to be impactful to the study of the impact of feeding on sleep behavior, as well as the phenomena of hippocampal SWRs in metabolism.

      Weaknesses:

      Details of experiments are missing in the text and figure legends. Additionally, the writing of the manuscript could be improved.

      We thank the reviewer for their favorable assessment of the work and its potential impact. We have added all requested details in the text and figure legends and revised the wording of the manuscript to improve its clarity.

      Reviewer #2 (Public review):

      Summary:

      Kaya et al uncover an intriguing relationship between hippocampal sharp wave-ripple production and peripheral hormone exposure, food intake, and lateral hypothalamic function. These findings significantly expand our understanding of hippocampal function beyond mnemonic processes and point a direction for promising future research.

      Strengths:

Some of the relationships observed in this paper are highly significant. In particular, the inverse relationship between GLP1/Leptin and Insulin/Ghrelin is particularly compelling as this aligns well with opposing hormone functions on satiety.

      Weaknesses:

      I would be curious if there were any measurable behavioral differences that occur with different hormone manipulations.

      We thank the reviewer for their favorable assessment of the work and its contribution to our understanding of non-mnemonic hippocampal function. Whether there are behavioral differences that occur following administration of the different hormones is a great question, yet unfortunately our study design did not include fine behavioral monitoring to the degree that would allow answering it. While some previous studies have partially addressed the behavioral consequences of the delivery of these hormones (and we reference these studies in our Discussion), how these changes may interact with the hippocampal and hypothalamic effects we observe is a very interesting next step.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Kaya et al. explores the effects of feeding on sharp wave-ripples (SWRs) in the hippocampus, which could reveal a better understanding of how metabolism is regulated by neural processes. Expanding on prior work that showed that SWRs trigger a decrease in peripheral glucose levels, the authors further tested the relationship between SWRs and meal consumption by recording LFPs from the dorsal CA1 region of the hippocampus before and after meal consumption. They found an increase in SWR magnitude during sleep after food intake, in both food restricted and ad libitum fed conditions. Using fiber photometry to detect GABAergic neuron activity in the lateral hypothalamus, they found increased activity locked to the onset of SWRs. They conclude that the animal's satiety state modulates the amplitude and rate of SWRs, and that SWRs modulate downstream circuits involved in regulating feeding. These experiments provide an important step forward in understanding how metabolism is regulated in the brain. However, currently, the paper lacks sufficient analyses to control for factors related to sleep quality and duration; adding these analyses would further support the claim that food intake itself, as opposed to sleep quality, is primarily responsible for changes in SWR activity. Adding this, along with some minor clarifications and edits, would lead to a compelling case for SWRs being modulated by a satiety state. The study will likely be of great interest in the field of learning and memory while carrying broader implications for understanding brain-body physiology.

      Strengths:

      The paper makes an innovative foray into the emerging field of brain-body research, asking how sharp wave-ripples are affected by metabolism and hunger. The authors use a variety of advanced techniques including LFP recordings and fiber photometry to answer this question. Additionally, they perform comprehensive and logical follow-up experiments to the initial food-restricted paradigm to account for deeper sleep following meal times and the difference between consumption of calories versus the experience of eating. These experiments lay the groundwork for future studies in this field, as the authors pose several follow-up questions regarding the role of metabolic hormones and downstream brain regions.

      We thank the reviewer for their appreciation and constructive review of the work.

      Weaknesses:

      Major comments:

      (1) The authors conclude that food intake regulates SWR power during sleep beyond the effect of food intake on sleep quality. Specifically, they made an attempt to control for the confounding effect of delta power on SWRs through a mediation analysis. However, a similar analysis is not presented for SWR rate. Moreover, this does not seem to be a sufficient control. One alternative way to address this confound would be to subsample the sleep data from the ad lib and food restricted conditions (or high calorie and low calorie, etc), to match the delta power in each condition. When periods of similar mean delta power (i.e. similar sleep quality) are matched between datasets, the authors can then determine if a significant effect on SWR amplitude and rate remains in the subsampled data.

      This is an important point that we believe we addressed in a few complementary ways. First, the mediation analysis we implemented measures the magnitude and significance of the contribution of food on SWR power after accounting for the effects of delta power, showing a highly significant food-SWR contribution. While the objective of subsampling is similar, mediation is a more statistically robust approach as it models the relationship between food, SWR power, and delta power in a way that explicitly accounts for the interdependence of these variables. Further, subsampling introduces the risk of losing statistical power by reducing the sample size, due to exclusion of data that might contain relevant and valuable information. Mediation analysis, on the other hand, uses the full dataset and retains statistical power while modeling the relationships between variables more holistically. However, as we were not satisfied with a purely analytical approach to test this issue, we carried out a new set of experiments in ad-libitum fed mice, where there is no concern of food restriction impairing sleep quality in the presleep session. In these conditions food amount also significantly correlated with, and showed significant mediation of, the SWR power change. Finally, we acknowledge and discuss this point in the Discussion, highlighting that given the known relationship between cortical delta and SWRs, it is challenging to fully disentangle these signals. 
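The product-of-coefficients logic behind such a mediation analysis can be sketched in a few lines of plain Python. This is an illustrative toy under simple OLS assumptions, not the authors' actual pipeline, which would also require an inference step such as bootstrapping the indirect effect:

```python
def _slope(x, y):
    """OLS slope of y ~ x (simple regression)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def _two_predictor_slopes(x, m, y):
    """OLS slopes (c', b) of y ~ x + m, via centered normal equations."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    xc = [v - mx for v in x]
    mc = [v - mm for v in m]
    yc = [v - my for v in y]
    sxx = sum(a * a for a in xc)
    smm = sum(a * a for a in mc)
    sxm = sum(a * b for a, b in zip(xc, mc))
    sxy = sum(a * b for a, b in zip(xc, yc))
    smy = sum(a * b for a, b in zip(mc, yc))
    det = sxx * smm - sxm * sxm
    c_prime = (sxy * smm - sxm * smy) / det  # direct effect of x on y
    b = (sxx * smy - sxm * sxy) / det        # effect of mediator on y
    return c_prime, b


def mediation_effects(x, m, y):
    """Product-of-coefficients mediation for x -> m -> y.
    Returns (indirect, direct, total); with OLS, indirect + direct == total."""
    a = _slope(x, m)                 # path a: e.g. food amount -> delta power
    c = _slope(x, y)                 # total effect: food amount -> SWR power
    c_prime, b = _two_predictor_slopes(x, m, y)
    return a * b, c_prime, c
```

A significant direct effect (c') after accounting for the mediator is what supports a food-SWR contribution beyond the delta-power pathway.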

      (2) Relatedly, are the animals spending the same amount of time sleeping in the ad lib vs. food restricted conditions? The amount of time spent sleeping could affect the probability of entering certain stages of sleep and thus affect SWR properties. A recent paper (Giri et al., Nature, 2024) demonstrated that sleep deprivation can alter the magnitude and frequency of SWRs. Could the authors quantify sleep quantity and control for the amount of time spent sleeping by subsampling the data, similar to the suggestion above?

      Following the reviewer’s comment, we have quantified and compared the amount of time spent in NREM sleep in the Pre and Post session pairs in which the animals were food restricted, with 0-1.5 g of chow given between the sleep sessions. We found that there was no significant difference in the amount of time spent in NREM sleep in the Pre and Post sessions. We have added this result to the Results section of the manuscript and as a new Supplementary Fig. 1. 

      Additionally, we have added details to the Methods section that were missing in the original submission that are relevant to this point. Specifically, within the sleep sessions, the ongoing sleep states were scored using the AccuSleep toolbox (https://github.com/zekebarger/AccuSleep) using the EEG and EMG signals. NREM periods were detected based on high EEG delta power and low EMG power, REM periods were detected based on high EEG theta power and low EMG power, and Wake periods were detected based on high EMG power. Importantly, only NREM periods were included for subsequent SWR detection, quantification and analyses (in particular, reported SWR rates reflect the number of SWRs per second of NREM sleep). 

(3) Plot 5I only reports significance but does not clearly show the underlying quantification of LH GABAergic activity. Upon reading the methods for how this analysis was conducted, it would be informative to see a plot of the pre-SWR and post-SWR integral values used for the paired t-test whose p-values are currently shown. For example, these values could be displayed as individual points overlaid on a pair of box-and-whisker plots of the pre- and post-distribution within the session (perhaps for one example session per mouse with the p-value reported, to supplement a plot of the distribution of p-values across sessions and mice). If these data are non-normal, the authors should also use a non-parametric statistical test.

      We have generated the summary plots the reviewer requested and have now included them in Supplementary Fig. 2. 

      Minor comments:

      (4) A brief explanation (perhaps in the discussion) of what each change in SWR property (magnitude, rate, duration) could indicate in the context of the hypothesis may be helpful in bridging the fields of metabolism and memory. For example, by describing the hypothesized mechanistic consequence of each change, could the authors speculate on why ripple rate may not increase in all the instances where ripple power increases after feeding? Why do the authors speculate that ripple duration does not increase, given that prior work (Fernandez-Ruiz et al. 2019) has shown that prolonged ripples support enhanced memory?

      This is an interesting point and we have added a section to the Discussion to discuss it (pg. 17, last paragraph)

      (5) The authors suggest that "SWRs could modulate peripheral metabolism" as a future implication of their work. However, the lack of clear effects from GLP-1, leptin and insulin complicates this interpretation. It might be informative for readers if the authors expanded their discussion of what specific role they speculate that SWRs could play in regulating metabolism, given these negative results.

      We have added a section to the Discussion proposing potential reasons for this point (pg. 16, last paragraph)

      Recommendations for the authors:  

      Reviewer #1 (Recommendations for the authors):

      Major Comments:

      (1) The experiments involve very precise windows of time for sleeping and eating that seem impossible to control. For example, the authors state that for the experiments in Figure 1, there was a 2-h sleep period, followed by a 1-h feeding period, followed by another 2-h sleep period. Without sleep deprivation procedures or other environmental manipulations, how can these periods be so well-defined? Even during the inactive period, mice typically don't sleep for 2-h bouts at once, and the addition of food would not likely lead to an exact 1-h period of wakefulness in the middle. The validity of these experimental times would be more believable if the authors provided much more data on these sessions. For example, the authors could provide a table or visual display of data for the actual timing of the pre-sleep, eating, and post-sleep phases with exact time measurements and/or visual display of sleep versus wakefulness.

      This is an important point, which we were not clear enough about in the original submission. While the durations of the Pre-sleep, Wake and Post-sleep sessions were indeed 2 h, 1 h and 2 h respectively, the animals did not actually sleep during the entirety of the sleep sessions. Importantly, we performed sleep state scoring on all sessions, and only analyzed identified NREM sleep for all SWR analyses. Following the reviewer’s comment (and that of Reviewer 1), we have quantified and compared the amount of time spent in NREM sleep in the Pre and Post session pairs in which the animals were food restricted and 0-1.5 g of chow were given between the sleep sessions. We found that there was no significant difference in the amount of time spent in NREM sleep in the Pre and Post sessions. We have added this result to the Results section of the manuscript and as a new Supplementary Fig. 1. 

      Additionally, we have added details to the Methods section that were missing in the original submission that are relevant to this point. Specifically, within the sleep sessions, the ongoing sleep states were scored using the AccuSleep toolbox (https://github.com/zekebarger/AccuSleep) using the EEG and EMG signals. NREM periods were detected based on high EEG delta power and low EMG power, REM periods were detected based on high EEG theta power and low EMG power, and Wake periods were detected based on high EMG power. Importantly, only NREM periods were included for subsequent SWR detection, quantification and analyses (in particular, reported SWR rates reflect the number of SWRs per second of NREM sleep). 

      (2) I may have missed this (although I tried searching in the text and figure legend), but the authors did not state the difference between green versus red bar colors in Figure 1 C-E. For Figures 1 F-J, do the individual dots represent both the test (fed) animals and control animals, or just the test animals?

We thank the reviewer for the opportunity to clarify these points. Red bars in Fig. 1C-E represent the SWR changes observed following delivery of 0.5 g or more of chow, while the green bars represent the changes observed following delivery of less than 0.5 g. Fig. 1F-J includes both the experimental and control animals, with the control animals shown as having received 0 g of food. This information has now been added to the figure legend.

      (3) For the jello experiments in Figure 3, was there only 1 trial per animal? Previous studies show that animals learn the caloric value of jello after subsequent trials, so whether or not multiple trials took place in each animal is important for interpretation of the results.

      In Figure 3, the datapoints within each panel represent different animals and this information has now been added to the figure legend. Nevertheless, the animals were previously habituated to all foods, including regular jello, sugar-free jello and chocolate. While we consider it unlikely that this prior experience was sufficient to underlie the differential effects on SWRs, we cannot fully rule out the possibility that it provided some ability to predict the caloric value and consequences of the different foods. We have added details to the acknowledgement of this point in the Discussion (pg. 17, second paragraph).

      (4) The experiments in Figure 5 are informative but don't relate to the experiments in the rest of the study. It is difficult to interpret their meaning given that these experiments take place over seconds while the other experiments take place over hours. Some attempt should be made to bridge these experiments over the timescales relevant for the behaviors studied in Figures 1-4.

      We have now further acknowledged and discussed the point that our investigation is limited to the timescale of seconds around SWRs, and thus identified a potential communication channel, but whether and how this communication changes across hours following feeding remains for future studies (pg. 18, second paragraph).

      (5) Figure 5B should depict the x-axis in seconds, not an arbitrary set of times from a recording.

      We have replaced these with a time scale bar.

      Minor Comments:

      (6) The writing of the manuscript can be improved in many places:

      Sometimes the writing could be more precise. For example, the Abstract states: "hippocampal sharp wave ripples (SWRs)... have been shown to influence peripheral glucose metabolism." Could this be written in a more informative way, rather than just staying "has been shown to influence?" A few more words would provide a lot more information. Similarly, at the end of the Introduction: "we set out to test the hypothesis that SWRs are modulated following meal times as part of the systems-level response to changing metabolic needs." This is not a strong hypothesis... could it be written to boldly state how the SWRs will be modulated (increase or decrease) and provide more assertive information?

      The writing can be grandiose at times. Phrases such as "life is a continuous journey" or "the hypothalamus is a master regulator of homeostasis" are a bit sophomoric and too colloquial.

Finally, a representative recording should be referred to as just that-a "representative recording," as opposed to a "snippet," which is also colloquial. This word is used in the figure legends to Figures 1 and 5, and misspelled as "sinpper" in Figure 1.

      We have reworded all these sentences and phrases to make them clearer, more concrete and more formal.

      (7) The methods state that the study used both male and female mice. Were they used in equal numbers across experiments?

      Only one female was used in the final dataset, and we have corrected the wording accordingly.

      Reviewer #2 (Recommendations for the authors):

      Great paper!

      Thanks!

      Reviewer #3 (Recommendations for the authors):

      Below are some minor requests for clarification, including in figures:

      (1) Fig. 5H y-axis should say "normalized dF/F."

      Done

      (2) Fig. 1B is missing a y-axis label. It may be clearer to display separate y-axis scale bars for each component (SWR envelope, ripple-filtered amplitude, etc).

      Done

      (3) Please include labels for brain areas and methodological components in Fig. 5A.

      Done

      (4) Should Fig. 5B have the same y-axis or scale bars as 1B?

      We have edited the figure labels and legends to be visually similar

      (5) In Fig. 5J, is the y-axis a count of sessions?

      Yes, we have added that to the y-axis label

      (6) Could the authors please clarify whether the sugar-free jello was sweetened with an artificial sweetener? If so, this is a robust control for the rewarding nature of the two jellos, so a quick clarification would highlight this strength of the experiment.

      We thank the reviewer for this great point. Indeed, the sugar free jello contained artificial sweeteners (Aspartame and Acesulfame Potassium). We have added this information to the Results and Methods.

      (7) It appears in Fig. 5 that there may be a reliable dip in activity **at** the time of SWR onset, followed by the increase afterward, as shown in the example FP trace and the individual ripple-triggered traces. Is this indeed the case, and does this dip fall significantly below baseline? This characterization would be interesting, but I acknowledge is not necessarily crucial to the study to include.

      This would indeed be an interesting finding, but upon examination and statistical testing, we found that this is not the case. We believe this may appear as such due to the normalization of the traces.

(8) The authors mention a reduction in ripple rate following insulin under food restriction as the only significant effect for insulin, GLP-1, and leptin, yet there was also a significant increase (at p<0.05) in ripple duration for GLP-1 in the ad lib condition. Is this not considered noteworthy?

      This is a fair point and we have reworded the description of this result to simply state that there were no robust, consistent, dose-dependent effects of GLP-1, leptin and insulin on SWR attributes.

    1. Here is a summary of the video with approximate timestamps based on the flow of the content:

      • Introduction (start of the video): The introduction is given by Elena, the founder of Toadhouse Games. She explains that this tutorial is designed for beginners with no coding knowledge and that the first videos will be free on YouTube. She introduces Ren'Py as a visual novel engine used by thousands of creators.

      • What is Ren'Py? (approx. 0:00 - 1:00): Ren'Py is an engine for creating visual novels and interactive fiction. Although it runs on Python code, you do not need to know how to code in order to use it. The software provides everything you need, including text editors.

      • Downloading Ren'Py (approx. 1:00 - 2:00): Go to renpy.org and click the download button. Different versions are available for Windows, Mac, Linux, Android, and iOS. Once the file has been downloaded, run it and extract the files to a folder of your choice.

      • Opening and touring the Ren'Py launcher (approx. 2:00 - 4:00): In the extracted folder, double-click the Ren'Py application (the icon with an anime character) to open the launcher. The launcher shows the open projects (the default tutorial and question projects) and the files associated with each project. On the right, the "script" option gives access to the code files, which can be edited in a text editor such as Atom. Ren'Py can download and install Atom for you.

      • Exploring the project files (approx. 4:00 - 5:00): The "game" folder contains all of the game's files (audio, music, images, etc.). A shortcut to the "images" folder is also available. The "script" file contains the game's code, including dialogue, transitions, music, and scenes. The options and screens let you customize the game's appearance.

      • Building and distributing the game (approx. 5:00 - 5:30): The "build distributions" option lets you create a playable version of your game to share with others on various platforms such as PC, Linux, Mac, itch.io, or Steam.

      • Hands-on exercise with "The Question" project (approx. 5:30 - 8:00): It is recommended to select the "the question" project and launch it to play the game. Then open the script for "the question". The exercise consists of playing the game while following the corresponding code in the text editor. This shows how the code drives the flow of the game (music, scenes, dialogue, choices). You can make small changes to the script and reload the game to see the results.

      • Introducing Scrivener (approx. 8:00 - 9:00): Scrivener is optional software that can be used to write the dialogue and organize the content of your visual novel. A Ren'Py template for Scrivener, created by Toadhouse Games, is available. Scrivener offers basic writing tips and templates for character profiles and Ren'Py code.

      • Conclusion (approx. 9:00 - end of the video): Elena encourages viewers to start experimenting with Ren'Py by modifying "the question" project. More advanced tutorials on flags and choices will follow later. Help resources are available on Twitter, by email (teamtoadhouse@gmail.com), on the Ren'Py subreddits and forums, and on the Toadhouse Games Discord.

    1. Despite the guarantees of equality in the 14th Amendment, the Supreme Court’s landmark Plessy v. Ferguson decision in 1896 declared that the racial segregation of black Americans was constitutional. With the blessing of the nation’s highest court and no federal will to vindicate black rights, starting in the late 1800s, Southern states passed a series of laws and codes meant to make slavery’s racial caste system permanent by denying black people political power, social equality and basic dignity. They passed literacy tests to keep black people from voting and created all-white primaries for elections. Black people were prohibited from serving on juries or testifying in court against a white person. South Carolina prohibited white and black textile workers from using the same doors. Oklahoma forced phone companies to segregate phone booths. Memphis had separate parking spaces for black and white drivers. Baltimore passed an ordinance outlawing black people from moving onto a block more than half white and white people from moving onto a block more than half black. Georgia made it illegal for black and white people to be buried next to one another in the same cemetery. Alabama barred black people from using public libraries that their own tax dollars were paying for. Black people were expected to jump off the sidewalk to let white people pass and call all white people by an honorific, though they received none no matter how old they were. In the North, white politicians implemented policies that segregated black people into slum neighborhoods and into inferior all-black schools, operated whites-only public pools and held white and “colored” days at the country fair, and white businesses regularly denied black people service, placing “Whites Only” signs in their windows. 
States like California joined Southern states in barring black people from marrying white people, while local school boards in Illinois and New Jersey mandated segregated schools for black and white children.

      For a country that is always talking about freedom and equality, this is super ironic. Black people were belittled and hated on so much.

  2. education.nationalgeographic.org
    1. Safari Parks: Larger than urban and open-range zoos, safari parks are areas where tourists can drive their own cars to see non-native wildlife living in large, enclosed areas. These attractions allow the animals more space than the small enclosures of traditional zoos. Fuji Safari Park, in Susono, Japan, offers a traditional zoo as well as a drive-through safari park. Visitors can take their own cars or one of the park’s buses. Fuji Safari Park offers night tours, so visitors can see nocturnal animals, or animals that are active at night. At the park, visitors can also feed some animals, such as lions, from bus windows. Not all parks encourage or even allow visitors to feed animals. Safari parks, especially in Europe, are often part of larger theme parks or resorts. They include golf courses and fairground attractions, such as games and rides.

      safari parks

  3. Mar 2025
    1. Reviewer #2 (Public review):

      Summary:

      The current dataset utilized a 2x2 factorial shuttle-escape task in combination with extracellular single-unit recording in the anterior cingulate cortex (ACC) of mice to determine ACC action coding. The contributions of neocortical signaling to action-outcome learning as assessed by behavioral tasks outside of the prototypical reward versus non-reward or punished vs non-punished is an important and relevant research topic, given that ACC plays a clear role in several human neurological and psychiatric conditions. The authors present useful findings regarding the role of ACC in action monitoring and learning. The core methods themselves - electrophysiology and behavior - are adequate; however, the analyses are incomplete since ruling out alternative explanations for neural activity, such as movement itself, requires substantial control analyses, and details on statistical methods are not clear.

      Strengths:

      (1) The factorial design nicely controls for sensory coding and value coding, since the same stimulus can signal different actions and values.

      (2) The figures are mostly well-presented, labeled, and easy to read.

      (3) Additional analyses, such as the 2.5/7.5s windows and place-field analysis, are nice to see and indicate that the authors were careful in their neural analyses.

      (4) The n-trial + 1 analysis where ACC activity was higher on trials that preceded correct responses is a nice addition, since it shows that ACC activity predicts future behavior, well before it happens.

      (5) The authors identified ACC neurons that fire to shuttle crossings in one direction or to crossings in both directions. This is very clear in the spike rasters and population-scaled color images. While other factors such as place fields, sensory input, and their integration can account for this activity, the authors discuss this and provide additional supplemental analyses.

      Weaknesses:

      (1) The behavioral data could use slightly more characterization, such as separating stay versus shuttle trials.

      (2) Some of the neural analyses could use the necessary and sufficient comparisons to strengthen the authors' claims.

      (3) Many of the neural analyses seem to utilize long time windows, not leveraging the very real strength of recording spike times. Specifics on the exact neural activity binning/averaging, tests, classifier validation, and methods for quantification are difficult to find.

      (4) The neural analyses seem to suggest that ACC neurons encode one variable or the other, but are there any that multiplex? Given the overwhelming evidence of multiplexing in the ACC, a bit more discussion of its presence or absence is warranted.

    1. What operating system (iOS, Android, Windows, Linux, etc.) is used. What browser and browser version are used.

      I know that many people prefer different versions of software and various browsers. But is there any real, significant difference between these pieces of software and browsers, or are they all relatively the same? What makes iOS so much different from what Android uses? I often hear that Google Chrome is the best browser to use. But why is this the case, and are there other browsers that are just as good?

    1. A network operating system provides an environment in which users can access remote resources (implementing resource sharing) by either logging in to the appropriate remote machine or transferring data from the remote machine to their own machines. Currently, all general-purpose operating systems, and even embedded operating systems such as Android and iOS, are network operating systems.

       19.4.1.1 Remote Login

       An important function of a network operating system is to allow users to log in remotely. The Internet provides the ssh facility for this purpose. To illustrate, suppose that a user at Westminster College wishes to compute on kristen.cs.yale.edu, a computer located at Yale University. To do so, the user must have a valid account on that machine. To log in remotely, the user issues the command

       ssh kristen.cs.yale.edu

       This command results in the formation of an encrypted socket connection between the local machine at Westminster College and the kristen.cs.yale.edu computer. After this connection has been established, the networking software creates a transparent, bidirectional link so that all characters entered by the user are sent to a process on kristen.cs.yale.edu and all the output from that process is sent back to the user. The process on the remote machine asks the user for a login name and a password. Once the correct information has been received, the process acts as a proxy for the user, who can compute on the remote machine just as any local user can.

       19.4.1.2 Remote File Transfer

       Another major function of a network operating system is to provide a mechanism for remote file transfer from one machine to another. In such an environment, each computer maintains its own local file system. If a user at one site (say, Kurt at albion.edu) wants to access a file owned by Becca located on another computer (say, at colby.edu), then the file must be copied explicitly from the computer at Colby in Maine to the computer at Albion in Michigan. The communication is one-directional and individual, such that other users at those sites wishing to transfer a file, say Sean at colby.edu to Karen at albion.edu, must likewise issue a set of commands. The Internet provides a mechanism for such a transfer with the file transfer protocol (FTP) and the more private secure file transfer protocol (SFTP). Suppose that user Carla at wesleyan.edu wants to copy a file that is owned by Owen at kzoo.edu. The user must first invoke the sftp program by executing

       sftp owen@kzoo.edu

       The program then asks the user for a login name and a password. Once the correct information has been received, the user can use a series of commands to upload files, download files, and navigate the remote file system structure. Some of these commands are:

       get—Transfer a file from the remote machine to the local machine.
       put—Transfer a file from the local machine to the remote machine.
       ls or dir—List files in the current directory on the remote machine.
       cd—Change the current directory on the remote machine.

       There are also various commands to change transfer modes (for binary or ASCII files) and to determine connection status.

       19.4.1.3 Cloud Storage

       Basic cloud-based storage applications allow users to transfer files much as with FTP. Users can upload files to a cloud server, download files to the local computer, and share files with other cloud-service users via a web link or other sharing mechanism through a graphical interface. Common examples include Dropbox and Google Drive. An important point about SSH, FTP, and cloud-based storage applications is that they require the user to change paradigms. FTP, for example, requires the user to know a command set entirely different from the normal operating-system commands. With SSH, the user must know appropriate commands on the remote system. For instance, a user on a Windows machine who connects remotely to a UNIX machine must switch to UNIX commands for the duration of the SSH session. (In networking, a session is a complete round of communication, frequently beginning with a login to authenticate and ending with a logoff to terminate the communication.) With cloud-based storage applications, users may have to log into the cloud service (usually through a web browser) or native application and then use a series of graphical commands to upload, download, or share files. Obviously, users would find it more convenient not to be required to use a different set of commands. Distributed operating systems are designed to address this problem.

      A network operating system (NOS) provides basic functionalities like remote login and file transfer, allowing users to access resources on other machines within a network. It forms the foundation of many systems, enabling communication and resource sharing over a network. The system often requires users to manually interact with remote machines, using commands such as SSH for remote login or FTP for file transfers. While these systems offer basic functionality, they may lack the transparency and integration provided by distributed operating systems. Network operating systems, such as UNIX and Windows Server, often require explicit user commands to interact with remote systems, making them more complex for users. As technology advances, the lines between network and distributed systems blur, with modern network operating systems incorporating some features of distributed systems, such as cloud storage integration or remote access through APIs, which streamlines user interactions and enhances accessibility across distributed resources.
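
      The get/put/ls/cd command set described in the excerpt is small enough to model in a few lines. The sketch below is a toy, in-memory stand-in for an SFTP-style session; the `ToySftpSession` class and its dict-based "remote" filesystem are invented for illustration, whereas a real sftp client speaks the SFTP protocol over an encrypted SSH channel.

```python
# Toy model of an SFTP-style command set: get, put, ls, cd.
# The "remote" filesystem is just a flat dict {path: bytes}; no networking.

class ToySftpSession:
    def __init__(self, remote_fs):
        self.remote_fs = remote_fs   # stand-in for the remote file system
        self.local_fs = {}           # local side of the transfer
        self.cwd = "/"

    def cd(self, path):
        # Change the current directory on the "remote" machine.
        self.cwd = path if path.endswith("/") else path + "/"

    def ls(self):
        # List files under the current remote directory.
        return sorted(p for p in self.remote_fs if p.startswith(self.cwd))

    def get(self, name):
        # Transfer a file from the remote machine to the local machine.
        self.local_fs[name] = self.remote_fs[self.cwd + name]

    def put(self, name, data):
        # Transfer a file from the local machine to the remote machine.
        self.remote_fs[self.cwd + name] = data

session = ToySftpSession({"/home/owen/notes.txt": b"hello"})
session.cd("/home/owen")
print(session.ls())                     # ['/home/owen/notes.txt']
session.get("notes.txt")
print(session.local_fs["notes.txt"])    # b'hello'
```

      The point of the toy is that each command only moves or lists data; authentication, encryption, and transfer modes (binary vs. ASCII) are what the real protocol adds on top.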

    2. 18.7.1 VMware

       VMware Workstation is a popular commercial application that abstracts Intel x86 and compatible hardware into isolated virtual machines. VMware Workstation is a prime example of a Type 2 hypervisor. It runs as an application on a host operating system such as Windows or Linux and allows this host system to run several different guest operating systems concurrently as independent virtual machines. The architecture of such a system is shown in Figure 18.9. In this scenario, Linux is running as the host operating system, and FreeBSD, Windows NT, and Windows XP are running as guest operating systems. At the heart of VMware is the virtualization layer, which abstracts the physical hardware into isolated virtual machines running as guest operating systems. Each virtual machine has its own virtual CPU, memory, disk drives, network interfaces, and so forth.

      VMware Workstation is a widely used Type 2 hypervisor that enables multiple virtual machines to run on a host operating system like Linux or Windows. It abstracts hardware resources, allowing guest operating systems to function independently with dedicated virtual components such as CPU, memory, and disk. This virtualization enhances resource utilization, system efficiency, and disaster recovery. A key feature is the ability to copy guest operating system files, enabling easy backups and migrations. The architecture consists of a virtualization layer between the host OS and hardware, ensuring seamless integration of multiple environments while maintaining isolation and performance across virtual machines.

    3. Type 2 hypervisors are less interesting to us as operating-system explorers, because there is very little operating-system involvement in these application-level virtual machine managers. This type of VMM is simply another process run and managed by the host, and even the host does not know that virtualization is happening within the VMM. Type 2 hypervisors have limits not associated with some of the other types. For example, a user needs administrative privileges to access many of the hardware assistance features of modern CPUs. If the VMM is being run by a standard user without additional privileges, the VMM cannot take advantage of these features. Due to this limitation, as well as the extra overhead of running a general-purpose operating system as well as guest operating systems, type 2 hypervisors tend to have poorer overall performance than type 0 or type 1. As is often the case, the limitations of type 2 hypervisors also provide some benefits. They run on a variety of general-purpose operating systems, and running them requires no changes to the host operating system. A student can use a type 2 hypervisor, for example, to test a non-native operating system without replacing the native operating system. In fact, on an Apple laptop, a student could have versions of Windows, Linux, Unix, and less common operating systems all available for learning and experimentation.

      Type 2 hypervisors run as applications on a host operating system, making them less efficient but more accessible. They allow users to run different operating systems without modifying the host OS, making them useful for testing and educational purposes. However, they have limited hardware access and higher overhead due to the need to run within a general-purpose OS. Performance is lower compared to Type 0 and Type 1 hypervisors. The section highlights how Type 2 hypervisors are convenient for individual users but are rarely used in large-scale virtualization environments.

    4. 18.5.3 Type 1 Hypervisor

       Type 1 hypervisors are commonly found in company data centers and are, in a sense, becoming “the data-center operating system.” They are special-purpose operating systems that run natively on the hardware, but rather than providing system calls and other interfaces for running programs, they create, run, and manage guest operating systems. In addition to running on standard hardware, they can run on type 0 hypervisors, but not on other type 1 hypervisors. Whatever the platform, guests generally do not know they are running on anything but the native hardware. Type 1 hypervisors run in kernel mode, taking advantage of hardware protection. Where the host CPU allows, they use multiple modes to give guest operating systems their own control and improved performance. They implement device drivers for the hardware they run on, since no other component could do so. Because they are operating systems, they must also provide CPU scheduling, memory management, I/O management, protection, and even security. Frequently, they provide APIs, but those APIs support applications in guests or external applications that supply features like backups, monitoring, and security. Many type 1 hypervisors are closed-source commercial offerings, such as VMware ESX, while some are open source or hybrids of open and closed source, such as Citrix XenServer and its open Xen counterpart. By using type 1 hypervisors, data-center managers can control and manage the operating systems and applications in new and sophisticated ways. An important benefit is the ability to consolidate more operating systems and applications onto fewer systems. For example, rather than having ten systems running at 10 percent utilization each, a data center might have one server manage the entire load. If utilization increases, guests and their applications can be moved to less-loaded systems live, without interruption of service. Using snapshots and cloning, the system can save the states of guests and duplicate those states—a much easier task than restoring from backups or installing manually or via scripts and tools. The price of this increased manageability is the cost of the VMM (if it is a commercial product), the need to learn new management tools and methods, and the increased complexity. Another type of type 1 hypervisor includes various general-purpose operating systems with VMM functionality. Here, an operating system such as Red Hat Enterprise Linux, Windows, or Oracle Solaris performs its normal duties as well as providing a VMM allowing other operating systems to run as guests. Because of their extra duties, these hypervisors typically provide fewer virtualization features than other type 1 hypervisors. In many ways, they treat a guest operating system as just another process, but they provide special handling when the guest tries to execute special instructions.

      Type 1 hypervisors, also known as "bare-metal" hypervisors, function as specialized operating systems that manage guest VMs directly on hardware. They run in kernel mode, handling CPU scheduling, memory management, and I/O operations while offering APIs for external tools. They improve data-center efficiency by consolidating workloads, enabling live migration, and simplifying backup and replication. However, their complexity and cost, especially for commercial options like VMware ESX, pose challenges. Some general-purpose operating systems, such as Red Hat Enterprise Linux, incorporate Type 1 hypervisors, treating VMs like processes but providing special execution handling.

    5. 18.2 History

       Virtual machines first appeared commercially on IBM mainframes in 1972. Virtualization was provided by the IBM VM operating system. This system has evolved and is still available. In addition, many of its original concepts are found in other systems, making it worth exploring. IBM VM/370 divided a mainframe into multiple virtual machines, each running its own operating system. A major difficulty with the VM approach involved disk systems. Suppose that the physical machine had three disk drives but wanted to support seven virtual machines. Clearly, it could not allocate a disk drive to each virtual machine. The solution was to provide virtual disks—termed minidisks in IBM's VM operating system. The minidisks were identical to the system's hard disks in all respects except size. The system implemented each minidisk by allocating as many tracks on the physical disks as the minidisk needed. Once the virtual machines were created, users could run any of the operating systems or software packages that were available on the underlying machine. For the IBM VM system, a user normally ran CMS—a single-user interactive operating system. For many years after IBM introduced this technology, virtualization remained in its domain. Most systems could not support virtualization. However, a formal definition of virtualization helped to establish system requirements and a target for functionality. The virtualization requirements called for:

       Fidelity. A VMM provides an environment for programs that is essentially identical to the original machine.
       Performance. Programs running within that environment show only minor performance decreases.
       Safety. The VMM is in complete control of system resources.

       These requirements still guide virtualization efforts today. By the late 1990s, Intel 80x86 CPUs had become common, fast, and rich in features. Accordingly, developers launched multiple efforts to implement virtualization on that platform. Both Xen and VMware created technologies, still used today, to allow guest operating systems to run on the 80x86. Since that time, virtualization has expanded to include all common CPUs, many commercial and open-source tools, and many operating systems. For example, the open-source VirtualBox project (http://www.virtualbox.org) provides a program that runs on Intel x86 and AMD 64 CPUs and on Windows, Linux, macOS, and Solaris host operating systems. Possible guest operating systems include many versions of Windows, Linux, Solaris, and BSD, including even MS-DOS and IBM OS/2.

      Virtual machines first became commercially available with IBM VM/370 in 1972, enabling mainframe partitioning into multiple VMs. A major challenge was disk allocation, which IBM resolved using minidisks, allowing virtual machines to have dedicated yet flexible storage. Initially, virtualization was exclusive to IBM, but broader adoption emerged when Intel 80x86 CPUs advanced in the 1990s. Technologies like Xen and VMware enabled virtualization on commodity hardware, making it mainstream. Today, projects like VirtualBox support multiple architectures and operating systems. The core principles—fidelity, performance, and safety—continue guiding virtualization developments, reinforcing its role in modern computing.
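
      The minidisk mechanism (allocate only as many physical tracks as each virtual disk needs, so a few drives can back many virtual machines) can be sketched as a simple pool allocator. Everything below, the `TrackPool` class and the track counts, is invented for illustration, not IBM's actual implementation.

```python
# Toy minidisk allocator: carve virtual disks out of a shared pool of
# physical tracks, in the spirit of IBM VM/370's minidisks.

class TrackPool:
    def __init__(self, total_tracks):
        self.free = total_tracks
        self.minidisks = {}                 # minidisk name -> track count

    def create_minidisk(self, name, tracks):
        # A minidisk gets exactly the number of tracks it needs.
        if tracks > self.free:
            raise ValueError("not enough free tracks")
        self.free -= tracks
        self.minidisks[name] = tracks

pool = TrackPool(total_tracks=300)          # e.g. three 100-track drives
for vm in range(7):                         # seven VMs share three drives
    pool.create_minidisk(f"vm{vm}", 40)
print(pool.free)                            # 300 - 7*40 = 20
```

      The sketch shows why the approach works: seven guests fit on three physical drives because each virtual disk consumes only its own tracks, not a whole drive.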

    6. Take a moment to note that with virtualization, the definition of “operating system” once again blurs. For example, consider VMM software such as VMware ESX. This virtualization software is installed on the hardware, runs when the hardware boots, and provides services to applications. The services include traditional ones, such as scheduling and memory management, along with new types, such as migration of applications between systems. Furthermore, the applications are, in fact, guest operating systems. Is the VMware ESX VMM an operating system that, in turn, runs other operating systems? Certainly it acts like an operating system. For clarity, however, we call the component that provides virtual environments a VMM. The implementation of VMMs varies greatly. Options include the following:

       Hardware-based solutions that provide support for virtual machine creation and management via firmware. These VMMs, which are commonly found in mainframe and large to midsized servers, are generally known as type 0 hypervisors. IBM LPARs and Oracle LDOMs are examples.

       Operating-system-like software built to provide virtualization, including VMware ESX (mentioned above), Joyent SmartOS, and Citrix XenServer. These VMMs are known as type 1 hypervisors.

       General-purpose operating systems that provide standard functions as well as VMM functions, including Microsoft Windows Server with Hyper-V and Red Hat Linux with the KVM feature. Because such systems have a feature set similar to type 1 hypervisors, they are also known as type 1.

       Applications that run on standard operating systems but provide VMM features to guest operating systems. These applications, which include VMware Workstation and Fusion, Parallels Desktop, and Oracle VirtualBox, are type 2 hypervisors.

       Paravirtualization, a technique in which the guest operating system is modified to work in cooperation with the VMM to optimize performance.

       Programming-environment virtualization, in which VMMs do not virtualize real hardware but instead create an optimized virtual system. This technique is used by Oracle Java and Microsoft .NET.

       Emulators that allow applications written for one hardware environment to run on a very different hardware environment, such as a different type of CPU.

       Application containment, which is not virtualization at all but rather provides virtualization-like features by segregating applications from the operating system. Oracle Solaris Zones, BSD Jails, and IBM AIX WPARs “contain” applications, making them more secure and manageable.

       INDIRECTION: “All problems in computer science can be solved by another level of indirection”—David Wheeler “… except for the problem of too many layers of indirection.”—Kevlin Henney

       The variety of virtualization techniques in use today is a testament to the breadth, depth, and importance of virtualization in modern computing. Virtualization is invaluable for data-center operations, efficient application development, and software testing, among many other uses.

      There are multiple types of VMMs, categorized based on how they integrate with hardware and software:

      Type 0 Hypervisors: Implemented in firmware, commonly used in enterprise servers (e.g., IBM LPARs, Oracle LDOMs).
      Type 1 Hypervisors: Installed directly on hardware, functioning as an OS substitute (e.g., VMware ESX, Citrix XenServer, Joyent SmartOS).
      Type 2 Hypervisors: Run as applications on an existing OS (e.g., VMware Workstation, Parallels, VirtualBox).
      Paravirtualization: Guest OS is modified for optimized interaction with the hypervisor.
      Programming-Environment Virtualization: Uses a custom virtual system instead of real hardware (e.g., Java Virtual Machine (JVM), Microsoft .NET).
      Emulators: Allow applications for one architecture to run on a different CPU type.
      Application Containment: Not true virtualization but isolates applications for security and management (e.g., Solaris Zones, BSD Jails).

    7. Even after the basic file-system algorithms have been selected, we can still improve performance in several ways. As was discussed in Chapter 12, storage device controllers include local memory to form an on-board cache that is large enough to store entire tracks or blocks at a time. On an HDD, once a seek is performed, the track is read into the disk cache starting at the sector under the disk head (reducing latency time). The disk controller then transfers any sector requests to the operating system. Once blocks make it from the disk controller into main memory, the operating system may cache the blocks there. Some systems maintain a separate section of main memory for a buffer cache, where blocks are kept under the assumption that they will be used again shortly. Other systems cache file data using a page cache. The page cache uses virtual memory techniques to cache file data as pages rather than as file-system-oriented blocks. Caching file data using virtual addresses is far more efficient than caching through physical disk blocks, as accesses interface with virtual memory rather than the file system. Several systems—including Solaris, Linux, and Windows—use page caching to cache both process pages and file data. This is known as unified virtual memory.

      To improve file system performance, storage devices use on-board caches to store tracks or blocks, reducing delays. The operating system also caches blocks in main memory for faster access. Some systems use a buffer cache for frequently used blocks, while others use a page cache that stores file data as virtual memory pages, making access more efficient. Systems like Solaris, Linux, and Windows use unified virtual memory to cache both process pages and file data.
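
      The buffer-cache behavior described above (keep blocks in memory on the assumption they will be used again shortly) is often approximated with a least-recently-used policy. A minimal sketch, assuming a caller-supplied `read_from_disk` function stands in for the real device:

```python
from collections import OrderedDict

# Minimal LRU buffer cache: keeps the N most recently used disk blocks
# in memory, evicting the least recently used block when the cache fills.

class BufferCache:
    def __init__(self, read_from_disk, capacity=4):
        self.read_from_disk = read_from_disk   # fallback for cache misses
        self.capacity = capacity
        self.blocks = OrderedDict()            # block number -> data
        self.hits = self.misses = 0

    def read_block(self, n):
        if n in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(n)         # mark as most recently used
        else:
            self.misses += 1
            self.blocks[n] = self.read_from_disk(n)
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict the LRU block
        return self.blocks[n]

fake_disk = lambda n: f"block-{n}".encode()
cache = BufferCache(fake_disk, capacity=2)
cache.read_block(1); cache.read_block(2); cache.read_block(1)  # hit on 1
cache.read_block(3)        # evicts block 2, the least recently used
print(cache.hits, cache.misses)    # 1 3
```

      Real kernels layer more on top (write-back, read-ahead, and the unified page cache mentioned in the excerpt), but the hit/miss/evict loop is the core idea.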

    8. Some operating systems, including UNIX, treat a directory exactly the same as a file—one with a “type” field indicating that it is a directory. Other operating systems, including Windows, implement separate system calls for files and directories and treat directories as entities separate from files. Whatever the larger structural issues, the logical file system can call the file-organization module to map the directory I/O into storage block locations, which are passed on to the basic file system and I/O control system.

      Some operating systems, like UNIX, treat directories as special types of files, while others, like Windows, handle files and directories separately with different system calls. Regardless of the approach, the logical file system maps directory operations to storage blocks, which are then managed by the basic file system and I/O control system.
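      The UNIX view of a directory as a file with a "type" field can be observed directly: on POSIX systems, `os.stat` returns the same metadata structure for directories and regular files, and the type lives in the mode bits. The file name below is a temporary placeholder.

```python
import os
import stat
import tempfile

# A directory and a regular file are both inodes; only the type bit differs.
d = tempfile.mkdtemp()
f = os.path.join(d, "example.txt")
with open(f, "w") as fh:
    fh.write("hello")

dir_mode = os.stat(d).st_mode
file_mode = os.stat(f).st_mode

is_dir = stat.S_ISDIR(dir_mode)    # the "type" field says directory
is_file = stat.S_ISREG(file_mode)  # ordinary (regular) file
```

Windows, by contrast, exposes directories through separate system calls, which is why portable code goes through abstractions like `os.stat` rather than assuming the UNIX model.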

    9. 14.6.2 Performance Even after the basic file-system algorithms have been selected, we can still improve performance in several ways. As was discussed in Chapter 12, storage device controllers include local memory to form an on-board cache that is large enough to store entire tracks or blocks at a time. On an HDD, once a seek is performed, the track is read into the disk cache starting at the sector under the disk head (reducing latency time). The disk controller then transfers any sector requests to the operating system. Once blocks make it from the disk controller into main memory, the operating system may cache the blocks there. Some systems maintain a separate section of main memory for a buffer cache, where blocks are kept under the assumption that they will be used again shortly. Other systems cache file data using a page cache. The page cache uses virtual memory techniques to cache file data as pages rather than as file-system-oriented blocks. Caching file data using virtual addresses is far more efficient than caching through physical disk blocks, as accesses interface with virtual memory rather than the file system. Several systems—including Solaris, Linux, and Windows—use page caching to cache both process pages and file data. This is known as unified virtual memory. Some versions of UNIX and Linux provide a unified buffer cache. To illustrate the benefits of the unified buffer cache, consider the two alternatives for opening and accessing a file. One approach is to use memory mapping (Section 13.5); the second is to use the standard system calls read() and write(). Without a unified buffer cache, we have a situation similar to Figure 14.10. Here, the read() and write() system calls go through the buffer cache. The memory-mapping call, however, requires using two caches—the page cache and the buffer cache. A memory mapping proceeds by reading in disk blocks from the file system and storing them in the buffer cache. 
Because the virtual memory system does not interface with the buffer cache, the contents of the file in the buffer cache must be copied into the page cache.

      Performance optimization involves caching strategies, memory management, and efficient write operations. Modern storage controllers use on-board caches, while operating systems employ buffer caches or page caches to reduce I/O latency. Unified caches prevent redundant data storage, minimizing memory waste and CPU overhead. Solaris and Linux leverage unified virtual memory for efficiency. Least Recently Used (LRU) algorithms manage cache replacement, but their implementation varies. Asynchronous writes improve speed by allowing immediate process continuation, whereas synchronous writes ensure data integrity. Balancing caching, write strategies, and metadata updates is crucial for maximizing storage performance while maintaining reliability.
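      The asynchronous-versus-synchronous write trade-off noted above can be seen from user space on a POSIX system: a plain `write` lands in buffers and returns immediately, while `os.fsync` asks the kernel to push the block to the device before continuing. The file path here is a temporary placeholder.

```python
import os
import tempfile

# Sketch contrasting buffered (asynchronous-style) writes with an explicit
# request to flush data to the storage device.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

with open(path, "w") as fh:
    fh.write("entry 1\n")    # lands in the user-space buffer / page cache
    fh.flush()               # push the user-space buffer into the kernel
    os.fsync(fh.fileno())    # ask the kernel to write the block to the device

with open(path) as fh:
    contents = fh.read()
```

Databases and journaling file systems use exactly this kind of forced write for metadata and log records, accepting the latency cost in exchange for integrity after a crash.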

    10. File types also can be used to indicate the internal structure of the file. Source and object files have structures that match the expectations of the programs that read them. Further, certain files must conform to a required structure that is understood by the operating system. For example, the operating system requires that an executable file have a specific structure so that it can determine where in memory to load the file and what the location of the first instruction is. Some operating systems extend this idea into a set of system-supported file structures, with sets of special operations for manipulating files with those structures. This point brings us to one of the disadvantages of having the operating system support multiple file structures: it makes the operating system large and cumbersome. If the operating system defines five different file structures, it needs to contain the code to support these file structures. In addition, it may be necessary to define every file as one of the file types supported by the operating system. When new applications require information structured in ways not supported by the operating system, severe problems may result. For example, assume that a system supports two types of files: text files (composed of ASCII characters separated by a carriage return and line feed) and executable binary files. Now, if we (as users) want to define an encrypted file to protect the contents from being read by unauthorized people, we may find neither file type to be appropriate. The encrypted file is not ASCII text lines but rather is (apparently) random bits. Although it may appear to be a binary file, it is not executable. As a result, we may have to circumvent or misuse the operating system's file-type mechanism or abandon our encryption scheme. Some operating systems impose (and support) a minimal number of file structures. This approach has been adopted in UNIX, Windows, and others. 
UNIX considers each file to be a sequence of 8-bit bytes; no interpretation of these bits is made by the operating system. This scheme provides maximum flexibility but little support. Each application program must include its own code to interpret an input file as to the appropriate structure. However, all operating systems must support at least one structure—that of an executable file—so that the system is able to load and run programs.

      File types define their structure, which some operating systems enforce for proper execution. While supporting multiple file structures adds flexibility, it can make the system complex, so many OSs, like UNIX, treat files as simple byte sequences, leaving interpretation to applications.
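      The UNIX "sequence of bytes" model described above is easy to demonstrate: the system stores whatever bytes an application hands it, with no interpretation. The "encrypted file" here is just a hypothetical payload of arbitrary non-ASCII bytes, which fits neither a text nor an executable structure yet is stored and retrieved intact.

```python
import os
import tempfile

# The OS imposes no structure: arbitrary bytes go in, identical bytes come out.
path = os.path.join(tempfile.mkdtemp(), "secret.bin")
payload = bytes([0x93, 0x00, 0x7F, 0xFE, 0x10])  # arbitrary, non-text bytes

with open(path, "wb") as fh:
    fh.write(payload)

with open(path, "rb") as fh:
    read_back = fh.read()  # interpretation is entirely up to the application
```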

    11. File locks provide functionality similar to reader–writer locks, covered in Section 7.1.2. A shared lock is akin to a reader lock in that several processes can acquire the lock concurrently. An exclusive lock behaves like a writer lock; only one process at a time can acquire such a lock. It is important to note that not all operating systems provide both types of locks: some systems provide only exclusive file locking. Furthermore, operating systems may provide either mandatory or advisory file-locking mechanisms. With mandatory locking, once a process acquires an exclusive lock, the operating system will prevent any other process from accessing the locked file. For example, assume a process acquires an exclusive lock on the file system.log. If we attempt to open system.log from another process—for example, a text editor—the operating system will prevent access until the exclusive lock is released. Alternatively, if the lock is advisory, then the operating system will not prevent the text editor from acquiring access to system.log. Rather, the text editor must be written so that it manually acquires the lock before accessing the file. In other words, if the locking scheme is mandatory, the operating system ensures locking integrity. For advisory locking, it is up to software developers to ensure that locks are appropriately acquired and released. As a general rule, Windows operating systems adopt mandatory locking, and UNIX systems employ advisory locks.

      The example of file locking in Java demonstrates how developers can manage concurrent file access. Java’s FileChannel class allows acquiring locks on specific portions of a file, supporting both shared and exclusive locks. This is particularly useful in applications that involve collaborative editing or log file management. However, improper use of file locks can lead to deadlocks, where two processes indefinitely wait for each other to release a lock. This example reinforces the importance of synchronization in file handling. Additionally, understanding how different operating systems implement file locking (such as Windows using mandatory locks and UNIX using advisory locks) can help developers write cross-platform applications that handle concurrency efficiently.
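      The advisory nature of UNIX locks can be shown with Python's `fcntl.flock` (a POSIX-only sketch rather than the Java `FileChannel` example the note refers to): holding an exclusive lock does not stop a process that never asks for the lock. The file name echoes the `system.log` example above.

```python
import fcntl
import os
import tempfile

# Advisory locking on UNIX: the lock only constrains cooperating processes.
path = os.path.join(tempfile.mkdtemp(), "system.log")
open(path, "w").close()

writer = open(path, "r+")
fcntl.flock(writer, fcntl.LOCK_EX)  # exclusive ("writer") lock

# A cooperating process would try LOCK_EX | LOCK_NB and back off on failure.
# An advisory lock does NOT prevent this plain, lock-ignoring read:
with open(path) as snooper_fh:
    snooper = snooper_fh.read()     # succeeds despite the exclusive lock

fcntl.flock(writer, fcntl.LOCK_UN)  # release
writer.close()
```

Under mandatory locking (the Windows default), the equivalent open or read would instead be blocked by the operating system until the lock was released.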

    12. 13.1.4 File Structure File types also can be used to indicate the internal structure of the file. Source and object files have structures that match the expectations of the programs that read them. Further, certain files must conform to a required structure that is understood by the operating system. For example, the operating system requires that an executable file have a specific structure so that it can determine where in memory to load the file and what the location of the first instruction is. Some operating systems extend this idea into a set of system-supported file structures, with sets of special operations for manipulating files with those structures. This point brings us to one of the disadvantages of having the operating system support multiple file structures: it makes the operating system large and cumbersome. If the operating system defines five different file structures, it needs to contain the code to support these file structures. In addition, it may be necessary to define every file as one of the file types supported by the operating system. When new applications require information structured in ways not supported by the operating system, severe problems may result. For example, assume that a system supports two types of files: text files (composed of ASCII characters separated by a carriage return and line feed) and executable binary files. Now, if we (as users) want to define an encrypted file to protect the contents from being read by unauthorized people, we may find neither file type to be appropriate. The encrypted file is not ASCII text lines but rather is (apparently) random bits. Although it may appear to be a binary file, it is not executable. As a result, we may have to circumvent or misuse the operating system's file-type mechanism or abandon our encryption scheme. Some operating systems impose (and support) a minimal number of file structures. This approach has been adopted in UNIX, Windows, and others. 
UNIX considers each file to be a sequence of 8-bit bytes; no interpretation of these bits is made by the operating system. This scheme provides maximum flexibility but little support. Each application program must include its own code to interpret an input file as to the appropriate structure. However, all operating systems must support at least one structure—that of an executable file—so that the system is able to load and run programs.

      File structures help the operating system and applications organize and manage data effectively. Some files, like source and object files, follow a defined structure expected by the programs that use them. Operating systems often impose structures on certain files, such as executable files, to facilitate their loading and execution. However, supporting multiple file structures increases system complexity and can restrict flexibility. Some systems, like UNIX, adopt a minimal approach, treating all files as byte sequences without enforcing structure. This approach enhances flexibility but places the burden on applications to interpret data. A rigid file structure system can create issues when new types of files, such as encrypted files, do not conform to predefined formats, requiring workarounds or alternative storage methods.
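      The one structure every operating system must understand is the executable, and loaders recognize it by a magic number at the start of the file. As a sketch, assuming the ELF format used on Linux, a check against the 4-byte ELF magic looks like this; the `fake_executable` header bytes are fabricated for illustration.

```python
# Loaders identify an executable by its leading magic number.
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary

def looks_like_elf(header: bytes) -> bool:
    return header[:4] == ELF_MAGIC

fake_executable = b"\x7fELF" + b"\x02\x01\x01" + b"\x00" * 9  # minimal stub
text_file = b"hello world\n"

elf_ok = looks_like_elf(fake_executable)
text_ok = looks_like_elf(text_file)
```

The same idea underlies the `file` utility, which classifies files by magic numbers precisely because UNIX itself attaches no structure to them.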

    13. Access Control The most common approach to the protection problem is to make access dependent on the identity of the user. Different users may need different types of access to a file or directory. The most general scheme to implement identity-dependent access is to associate with each file and directory an access-control list (ACL) specifying user names and the types of access allowed for each user. When a user requests access to a particular file, the operating system checks the access list associated with that file. If that user is listed for the requested access, the access is allowed. Otherwise, a protection violation occurs, and the user job is denied access to the file. This approach has the advantage of enabling complex access methodologies. The main problem with access lists is their length. If we want to allow everyone to read a file, we must list all users with read access. This technique has two undesirable consequences:

    - Constructing such a list may be a tedious and unrewarding task, especially if we do not know in advance the list of users in the system.
    - The directory entry, previously of fixed size, now must be of variable size, resulting in more complicated space management.

    These problems can be resolved by use of a condensed version of the access list. To condense the length of the access-control list, many systems recognize three classifications of users in connection with each file:

    - Owner. The user who created the file is the owner.
    - Group. A set of users who are sharing the file and need similar access is a group, or work group.
    - Other. All other users in the system.

    The most common recent approach is to combine access-control lists with the more general (and easier to implement) owner, group, and universe access-control scheme just described. For example, Solaris uses the three categories of access by default but allows access-control lists to be added to specific files and directories when more fine-grained access control is desired.
    To illustrate, consider a person, Sara, who is writing a new book. She has hired three graduate students (Jim, Dawn, and Jill) to help with the project. The text of the book is kept in a file named book.tex. The protection associated with this file is as follows:

    - Sara should be able to invoke all operations on the file.
    - Jim, Dawn, and Jill should be able only to read and write the file; they should not be allowed to delete the file.
    - All other users should be able to read, but not write, the file. (Sara is interested in letting as many people as possible read the text so that she can obtain feedback.)

      Unlike Unix, Windows manages ACLs through a graphical user interface, making it more user-friendly. The section highlights an example from Windows 7 (NTFS file system) where a user named "guest" is explicitly denied access to a file. Windows permissions allow detailed configurations, such as granting or restricting access based on individual users or groups. This flexibility is useful in corporate environments where different departments require different levels of access. However, managing ACLs in Windows can be complex due to overlapping permissions and precedence rules. The section also discusses how operating systems prioritize permissions when conflicts arise, with Solaris and other Unix-like systems granting precedence to ACLs over standard permissions. This approach ensures that more specific rules override general access settings.
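      The condensed owner/group/other scheme maps Sara's `book.tex` example onto UNIX mode bits: owner and group get read+write, others read-only, i.e. mode `0o664`. (Delete permission lives on the containing directory in UNIX, so it is not expressed in the file's mode.) A sketch using a temporary file:

```python
import os
import stat
import tempfile

# Sara's book.tex: owner rw-, group rw-, other r-- => mode 0o664.
path = os.path.join(tempfile.mkdtemp(), "book.tex")
open(path, "w").close()
os.chmod(path, 0o664)

mode = stat.S_IMODE(os.stat(path).st_mode)
owner_can_write = bool(mode & stat.S_IWUSR)  # Sara
group_can_write = bool(mode & stat.S_IWGRP)  # Jim, Dawn, and Jill
other_can_write = bool(mode & stat.S_IWOTH)  # everyone else
other_can_read = bool(mode & stat.S_IROTH)
```

When this three-way split is too coarse, systems like Solaris layer ACLs on top, exactly as the excerpt describes.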

    14. Shared Memory in the Windows API

      Windows provides built-in support for memory-mapped files to enable shared memory between processes. The process begins with the creation of a file mapping using CreateFile() and CreateFileMapping(), followed by mapping a view of the file into a process’s address space with MapViewOfFile(). A second process can then access the same memory-mapped file, allowing seamless data sharing. This mechanism is particularly useful for producer-consumer scenarios, where one process writes to shared memory while another reads from it. The example code in this section demonstrates how a producer writes a message to a shared-memory object, which a consumer then reads. This implementation avoids the overhead of traditional IPC methods like pipes or message queues, leveraging efficient memory management instead.
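      The producer-consumer pattern described for `CreateFileMapping()`/`MapViewOfFile()` can be sketched portably with Python's `mmap` module, which wraps the same mechanism (file-backed shared memory). The file name, buffer size, and message below are placeholders, and for simplicity both "views" live in one process.

```python
import mmap
import os
import tempfile

# Two views of one memory-mapped file: the producer writes, the consumer reads.
path = os.path.join(tempfile.mkdtemp(), "shared.dat")
SIZE = 64
with open(path, "wb") as fh:
    fh.write(b"\x00" * SIZE)  # back the mapping with a real file

producer = open(path, "r+b")
view_a = mmap.mmap(producer.fileno(), SIZE)  # producer's mapped view
view_a[:13] = b"Shared memory"               # write directly into memory

consumer = open(path, "r+b")
view_b = mmap.mmap(consumer.fileno(), SIZE)  # independent consumer view
message = bytes(view_b[:13])                 # sees the producer's bytes

for view in (view_a, view_b):
    view.close()
producer.close()
consumer.close()
```

As in the Windows API example, no data is copied through the kernel per transfer; both parties simply read and write the same mapped pages.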

    1. .

      Just to be clear, I would change it to the below (or something similar) just in case people think they only need to do it on the control machine.

      During installation, it is also recommended to add the Nuke Stage Listener shortcut to your Windows startup app on all machines used for Nuke Stage.

    1. Reviewer #1 (Public review):

      Summary:

      The authors of this study set out to find RNA binding proteins in the CNS in cell-type specific sequencing data and discover that the cardiomyopathy-associated protein RBM20 is selectively expressed in olfactory bulb glutamatergic neurons and PV+ GABAergic neurons. They make an HA-tagged RBM20 allele to perform CLIP-seq to identify RBM20 binding sites and find direct targets of RBM20 in olfactory bulb glutamatergic neurons. In these neurons, RBM20 binds intronic regions. RBM20 has previously been implicated in splicing, but when they selectively knockout RBM20 in glutamatergic neurons they do not see changes in splicing, but they do see changes in RNA abundance, especially of long genes with many introns, which are enriched for synapse-associated functions. These data show that RBM20 has important functions in gene regulation in neurons, which was previously unknown, and they suggest it acts through a mechanism distinct from what has been studied before in cardiomyocytes.

      Strengths:

      The study finds expression of the cardiomyopathy-associated RNA binding protein RBM20 in specific neurons in the brain, opening new windows into its potential functions there.

      The study uses CLIP-seq to identify RBM20 binding RNAs in olfactory bulb neurons.

      Conditional knockout of RBM20 in glutamatergic or PV neurons allows the authors to detect mRNA expression that is regulated by RBM20.

      The data include substantial controls and quality control information to support the rigor of the findings.

      Weaknesses:

      The authors do not fully identify the mechanism by which RBM20 acts to regulate RNA expression in neurons, though they do provide data suggesting that neuronal RBM20 does not regulate alternative splicing in neurons, which is an interesting contrast to its proposed mechanism of function in cardiomyocytes. Discovery of the RNA regulatory functions of RBM20 in neurons is left as a question for future studies.

      The study does not identify functional consequences of the RNA changes in the conditional knockout cells, so this is also a question for the future.

    1. Reviewer #2 (Public review):

      van Vliet and colleagues present results of a study correlating internal states of a convolutional neural network trained on visual word stimuli with evoked MEG potentials during reading.

      In this study, a standard deep learning image recognition model (VGG-11) trained on a large natural image set (ImageNet) that begins illiterate but is then further trained on visual word stimuli, is used on a set of predefined stimulus images to extract strings of characters from "noisy" words, pseudowords and real words. This methodology is used in hopes of creating a model which learns to apply the same nonlinear transforms that could be happening in different regions of the brain - which would be validated by studying the correlations between the weights of this model and neural responses. Specifically, the aim is that the model learns some vector embedding space, as quantified by the spread of activations across a layer's weights (L2 Norm prior to ReLu Activation Function), for the different kinds of stimuli, that creates a parameterized decision boundary that is similar to amplitude changes at different times for a MEG signal. More importantly, the way that the stimuli are ordered or ranked in that space should be separable to the degree we see separation in neural activity. This study does show that the layer weights corresponding to five different broad classes of stimuli do statistically correlate with three specific components in the ERP. However, I believe there are fundamental theoretical issues that limit the implications of the results of this study.

      As has been shown over many decades, there are many potential computational algorithms, with varied model architectures, that can perform the task of text recognition from an image. However, there is no evidence presented here that this particular algorithm has comparable performance to human behavior (i.e. similar accuracy with a comparable pattern of mistakes). This is a fundamental prerequisite before attempting to meaningfully correlate these layer activations to human neural activations. Therefore, it is unlikely that correlating these derived layer weights to neural activity provides meaningful novel insights into neural computation beyond what is seen using traditional experimental methods.

      One example of a substantial discrepancy between this model and neural activations is that, while incorporating frequency weighting into the training data is shown to slightly increase neural correlation with the model, Figure 7 shows that no layer of the model appears directly sensitive to word frequency. This is in stark contrast to the strong neural sensitivity to word frequency seen in EEG (e.g. Dambacher et al 2006 Brain Research), fMRI (e.g. Kronbichler et al 2004 NeuroImage), MEG (e.g. Huizeling et al 2021 Neurobio. Lang.), and intracranial (e.g. Woolnough et al 2022 J. Neurosci.) recordings. Figure 7 also demonstrates that late stages of the model show a strong negative correlation with font size, whereas later stages of neural visual word processing are typically insensitive to differences in visual features, instead showing sensitivity to lexical factors.

      Another example of the mismatch between this model and visual cortex is the lack of feedback connections in the model. Within visual cortex there are extensive feedback connections, with later processing stages providing recursive feedback to earlier stages. This is especially evident in reading, where feedback from lexical level processes feeds back to letter level processes (e.g. Heilbron et al 2020 Nature Comms.). This feedback is especially relevant for reading of words in noisy conditions, as tested in the current manuscript, as lexical knowledge enhances letter representation in visual cortex (the word superiority effect). This results in neural activity in multiple cortical areas varying over time, changing selectivity within a region at different measured time points (e.g. Woolnough et al 2021 Nature Human Behav.), which in the current study is simplified down to three discrete time windows, each attributed to different spatial locations.

      The presented model needs substantial further development to be able to replicate, both behaviorally and neurally, many of the well-characterized phenomena seen in human behavior and neural recordings that are fundamental hallmarks of human visual word processing. Until that point it is unclear what novel contributions can be gleaned from correlating low dimensional model weights from these computational models with human neural data.

      The revised version of this manuscript has not addressed these concerns.

    2. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their efforts. They have pointed out several shortcomings and made very helpful suggestions. Based on their feedback, we have substantially revised the manuscript and feel the paper has been much improved because of it.

      Notable changes are:

      (1) As our model does not contain feed-back connections, the focus of the study is now more clearly communicated to be on feed-forward processes only, with appropriate justifications for this choice added to the Introduction and Discussion sections. Accordingly, the title has been changed to include the term “feed-forward”.

      (2) The old Figure 5 has been removed in favor of reporting correlation scores to the right of the response profiles in other figures.

      (3) We now discuss changes to the network architecture (new Figure 5) and fine-tuning of the hyperparameters (new Figure 6) in the main text instead of only the Supplementary Information.

      (4) The discussion on qualitative versus quantitative analysis has been extended and given its own subsection entitled “On the importance of experimental contrasts and qualitative analysis of the model”.

      Below, we address each point that the reviewers brought up in detail and outline what improvements we have made in the revision to address them.

      Reviewer #1 (Public Review):

      Summary:

      This study trained a CNN for visual word classification and supported a model that can explain key functional effects of the evoked MEG response during visual word recognition, providing an explicit computational account from detection and segmentation of letter shapes to final word-form identification.

      Strengths:

      This paper not only bridges an important gap in modeling visual word recognition, by establishing a direct link between computational processes and key findings in experimental neuroimaging studies, but also provides some conditions to enhance biological realism.

      Weaknesses:

      The interpretation of CNN results, especially the number of layers in the final model and its relationship with the processing of visual words in the human brain, needs to be further strengthened.

      We have experimented with the number of layers and the number of units in each layer. In the previous version of the manuscript, these results could be found in the supplementary information. For the revised version, we have brought some of these results into the main text and discuss them more thoroughly.

      We have added a figure (Figure 5 in the revised manuscript) showing the impact of the number of convolution and fully-connected layers on the response profiles of the layers, as well as the correlation with the three MEG components.

      We discuss the figure in the Results section as follows:

      “Several variations in model architecture and training procedure were evaluated. We found that the number of layers had a large impact on the response patterns produced by the model (Figure 5). The original VGG-11 architecture defines 5 convolution layers and 3 fully connected layers (including the output layer). Removing a convolution layer (Figure 5, top row), or removing one of the fully connected layers (Figure 5, second row), resulted in a model that did exhibit an enlarged response to noisy stimuli in the early layers that mimics the Type-I response. However, such models failed to show a sufficiently diminished response to noisy stimuli in the later layers, hence failing to produce responses that mimic the Type-II or N400m, a failure which also showed as low correlation scores.

      Adding an additional convolution layer (Figure 5, third row) resulted in a model where none of the layer response profiles mimics that of the Type-II response. The Type-II response is characterized by a reduced response to both noise and symbols, but an equally large response to consonant strings, real and pseudo words. However, in the model with an additional convolution layer, the consonant strings evoked a reduced response already in the first fully connected layer, which is a feature of the N400m rather than the Type-II. These kinds of subtleties in the response pattern, which are important for the qualitative analysis, generally did not show quantitatively in the correlation scores, as the fully connected layers in this model correlate as well with the Type-II response as models that did show a response pattern that mimics the Type-II.

      Adding an additional fully connected layer (Figure 5, fourth row) resulted in a model with similar response profiles and correlation with the MEG components as the original VGG-11 architecture (Figure 5, bottom row). The N400m-like response profile is now observed in the third fully connected layer rather than the output layer. However, the decrease in response to consonant strings versus real and pseudo words, which is typical of the N400m, is less distinct than in the original VGG-11 architecture.”

      And in the Discussion section:

      “In the model, convolution units are followed by pooling units, which serve the purpose of stratifying the response across changes in position, size and rotation within the receptive field of the pooling unit. Hence, the effect of small differences in letter shape, such as the usage of different fonts, was only present in the early convolution layers, in line with findings in the EEG literature (Chauncey et al., 2008; Grainger & Holcomb, 2009; Hauk & Pulvermüller, 2004). However, the ability of pooling units to stratify such differences depends on the size of their receptive field, which is determined by the number of convolution-and-pooling layers. As a consequence, the response profiles of the subsequent fully connected layers were also very sensitive to the number of convolution-and-pooling layers. The optimal number of such layers is likely dependent on the input size and pooling strategy. Given the VGG-11 design of doubling the receptive field after each layer, combined with an input size of 225×225 pixels, the optimal number of convolution-and-pooling layers for our model was five, or the model would struggle to produce response profiles mimicking those of the Type-II component in the subsequent fully connected layers (Figure 5).”

      Reviewer #1 (Recommendations For The Authors):

      (1) The similarity between CNNs and human MEG responses, including type-I (100ms), type-II (150ms), and N400 (400ms) components, looks like separately, lacking the sequential properties among these three components. Is the recurrent neural network (RNN), which can be trained to process and convert a sequential data input into a specific sequential data output, a better choice?

      When modeling sequential effects, meaning that the processing of the current word is influenced by the word that came before it, such as priming and top-down modulations, we agree that such a model would indeed require recurrency in its architecture. However, we feel that the focus of modeling efforts in reading has been overwhelmingly on the N400 and such priming effects, usually skipping over the pixel-to-letter process. So, for this paper, we were keen on exploring more basic effects such as noise and symbols versus letters on the type-I and type-II responses. And for these effects, a feed-forward model turns out to be sufficient, so we can keep the focus of this particular paper on bottom-up processes during single word reading, on which there is already a lot to say.

      To clarify our focus on feed-forward processes, we have modified the title of the paper to be:

      “Convolutional networks can model the functional modulation of the MEG responses associated with feed-forward processes during visual word recognition”. Furthermore, we have revised the Introduction to highlight this choice, noting:

      “Another limitation is that these models have primarily focused on feed-back lexicosemantic effects while oversimplifying the initial feed-forward processing of the visual input.

      […]

      For this study, we chose to focus on modeling the early feed-forward processing occurring during visual word recognition, as the experimental setup in Vartiainen et al. (2011) was designed to demonstrate.

      […]

      By doing so, we restrict ourselves to an investigation of how well the three evoked components can be explained by a feed-forward CNN in an experimental setting designed to demonstrate feed-forward effects. As such, the goal is not to present a complete model of all aspects of reading, which should include feed-back effects, but rather to demonstrate the effectiveness of using a model that has a realistic form of input when the aim is to align the model with the evoked responses observed during visual word recognition.”

      And in the Discussion section:

      “In this paper we have restricted our simulations to feed-forward processes. Now, the way is open to incorporate convolution-and-pooling principles in models of reading that simulate feed-back processes as well, which should allow the model to capture more nuance in the Type-II and N400m components, as well as extend the simulation to encompass a realistic semantic representation.”

      (2) There is no clear relationship between the layers that signal needs to traverse in the model and the relative duration of the three components in the brain.

      While some models offer a tentative mapping between layers and locations in the brain, none of the models we are aware of actually simulate time accurately and our model is no exception.

      While we provide some evidence that the three MEG components are best modeled with different types of layers, and that in our model the type-I comes before the type-II, with the N400m last, the lack of timing information is a weakness of our model we have not been able to address. In our previous version, this was already the main topic of our “Limitations of the model” section, but since this weakness was pointed out by all reviewers, we have decided to widen our discussion of it:

      “One important limitation of the current model is the lack of an explicit mapping from the units inside its layers to specific locations in the brain at specific times. The temporal ordering of the components is simulated correctly, with the response profile matching that of the type-I occurring in the layers before those matching the type-II, followed by the N400m. Furthermore, every component is best modeled by a different type of layer, with the type-I best described by convolution-and-pooling, the type-II by fully-connected linear layers and the N400m by a one-hot encoded layer. However, there is no clear relationship between the number of layers the signal needs to traverse in the model and the processing time in the brain. Even if one considers that the operations performed by the initial two convolution layers happen in the retina rather than the brain, the signal needs to propagate through three more convolution layers to reach the point where it matches the type-II component at 140-200 ms, but only through one additional layer to reach the point where it starts to match the N400m component at 300-500 ms. Still, cutting down on the number of times convolution is performed in the model seems to make it unable to achieve the desired suppression of noise (Figure 5). It also raises the question of what the brain is doing during the time between the type-II and N400m components that seems to take so long. It is possible that the timings of the MEG components are not indicative solely of when the feed-forward signal first reaches a certain location, but are rather dictated by the resolution of feed-forward and feedback signals (Nour Eddine et al., 2024).”

      See also our response to the next comment of the Reviewer, in which we dive more into the effect of the number of layers, which could be seen as a manipulation of time.

      (3) I am impressed by the CNN that authors modified to match the human brain pattern for the visual word recognition process, by the increase and decrease of the number of layers. The result of this part was a little different from the author’s expectation; however, the author didn’t explain or address this issue.

      We are glad to hear that the reviewer found these results interesting. Accordingly, we now discuss these results more thoroughly in the main text.

      We have moved the figure from the supplementary information to the main text (Figure 5 in the revised manuscript) and describe the results in the Results section:

      “Various variations in model architecture and training procedure were evaluated. We found that the number of layers had a large impact on the response patterns produced by the model (Figure 5). The original VGG-11 architecture defines 5 convolution layers and 3 fully connected layers (including the output layer). Removing a convolution layer (Figure 5, top row), or removing one of the fully connected layers (Figure 5, second row), resulted in a model that did exhibit an enlarged response to noisy stimuli in the early layers that mimics the Type-I response. However, such models failed to show a sufficiently diminished response to noisy stimuli in the later layers, hence failing to produce responses that mimic the Type-II or N400m, a failure which also showed as low correlation scores.

      Adding an additional convolution layer (Figure 5, third row) resulted in a model where none of the layer response profiles mimics that of the Type-II response. The Type-II response is characterized by a reduced response to both noise and symbols, but an equally large response to consonant strings, real and pseudo words. However, in the model with an additional convolution layer, the consonant strings evoked a reduced response already in the first fully connected layer, which is a feature of the N400m rather than the Type-II. These kinds of subtleties in the response pattern, which are important for the qualitative analysis, generally did not show quantitatively in the correlation scores, as the fully connected layers in this model correlate as well with the Type-II response as models that did show a response pattern that mimics the Type-II.

      Adding an additional fully connected layer (Figure 5, fourth row) resulted in a model with similar response profiles and correlation with the MEG components as the original VGG-11 architecture (Figure 5, bottom row). The N400m-like response profile is now observed in the third fully connected layer rather than the output layer. However, the decrease in response to consonant strings versus real and pseudo words, which is typical of the N400m, is less distinct than in the original VGG-11 architecture.”

      We also incorporated these results in the Discussion:

      “However, the ability of pooling units to stratify such differences depends on the size of their receptive field, which is determined by the number of convolution-and-pooling layers. This might also explain why, in later layers, we observed a decreased response to stimuli where text was rendered with a font size exceeding the receptive field of the pooling units (Figure 8). Hence, the response profiles of the subsequent fully connected layers were very sensitive to the number of convolution-and-pooling layers. This number is probably dependent on the input size and pooling strategy. Given the VGG-11 design of doubling the receptive field after each layer, combined with an input size of 225x225 pixels, the optimal number of convolution-and-pooling layers for our model was five, or the model would struggle to produce response profiles mimicking those of the type-II component in the subsequent fully connected layers (Figure 5).

      […]

      A minimum of two fully connected layers was needed to achieve this in our case, and adding more fully connected layers would make them behave more like the component (Figure 5).”
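      As a back-of-the-envelope illustration of the receptive-field arithmetic in the quoted passage, the following sketch may help (the 225x225 input and the five stages come from the text; the exact halving-and-doubling scheme is our simplifying assumption, not the authors' code):

```python
# Sketch of the spatial-size arithmetic for a VGG-11-style stack in which each
# convolution-and-pooling stage halves the feature map (2x2 max-pooling,
# stride 2) and doubles the pooling units' receptive field. Illustrative only.

def stage_sizes(input_px: int, n_stages: int) -> list[int]:
    """Feature-map side length after each conv + 2x2 max-pool stage."""
    sizes, size = [], input_px
    for _ in range(n_stages):
        size //= 2  # 2x2 max-pooling with stride 2 halves the map
        sizes.append(size)
    return sizes

def receptive_fields(n_stages: int) -> list[int]:
    """Receptive field (in input pixels) of a pooling unit at each stage,
    assuming it doubles with every stage."""
    return [2 ** (s + 1) for s in range(n_stages)]

# For a 225x225 input, five stages shrink the map to 7x7 units whose
# receptive fields span 32 input pixels each.
print(stage_sizes(225, 5))     # [112, 56, 28, 14, 7]
print(receptive_fields(5))     # [2, 4, 8, 16, 32]
```

      Under these assumptions, five stages leave a 7x7 map with 32-pixel receptive fields, which also hints at why text rendered larger than the deepest pooling units' receptive field could behave differently in later layers.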

      (4) Can the author explain why the number of layers in the final model is optimal by benchmarking the brain hierarchy?

      We have incorporated the figure describing the correlation between each model and the MEG components (previously Figure 5) with the figures describing the response profiles (Figures 4 and 5 in the revised manuscript and Supplementary Figures 2-6). This way, we (and the reader) can now benchmark every model qualitatively and quantitatively.

      As we stated in our response to the previous comment, we have added a more thorough discussion on the number of layers, which includes the justification for our choice for the final model. The benchmark we used was primarily whether the model shows the same response patterns as the Type-I, Type-II and N400m responses, which disqualifies all models with fewer than 5 convolution and 3 fully connected layers. Models with more layers also show the proper response patterns; however, we see that there is actually very little difference in the correlation scores between the different models. Hence, our justification for sticking with the original VGG-11 architecture is that it produces the qualitatively best response profiles, while having roughly the same (decently high) correlation with the MEG components. Furthermore, by sticking to the standard architecture, we make it slightly easier to replicate our results, as one can use readily available pre-trained ImageNet weights.

      As well as always discussing the correlation scores in tandem with the qualitative analysis, we have added the following statement to the Results:

      “Based on our qualitative and quantitative analysis, the model variant that performed best overall was the model that had the original VGG-11 architecture and was pre-initialized from earlier training on ImageNet, as depicted in the bottom rows of Figure 4 and Figure 5.”

      Reviewer #2 (Public Review):

      As has been shown over many decades, many potential computational algorithms, with varied model architectures, can perform the task of text recognition from an image. However, there is no evidence presented here that this particular algorithm has comparable performance to human behavior (i.e. similar accuracy with a comparable pattern of mistakes). This is a fundamental prerequisite before attempting to meaningfully correlate these layer activations to human neural activations. Therefore, it is unlikely that correlating these derived layer weights to neural activity provides meaningful novel insights into neural computation beyond what is seen using traditional experimental methods.

      We very much agree with the reviewer that a qualitative analysis of whether the model can explain experimental effects needs to happen before a quantitative analysis, such as evaluating model-brain correlation scores. In fact, this is one of the intended key points we wished to make.

      As we discuss at length in the Introduction, “traditional” models of reading (those that do not rely on deep learning) are not able to recognize a word regardless of exact letter shape, size, and (up to a point) rotation. In this study, our focus is on these low-level visual tasks rather than high-level tasks concerning semantics. As the Reviewer correctly states, there are many potential computational algorithms able to perform these visual tasks at a human level, and so we need to evaluate the model not only on its ability to mimic human accuracy but also on generating a comparable pattern of mistakes. In our case, we need a pattern of behavior that is indicative of the visual processes at the beginning of the reading pipeline. Hence, rather than relying on behavioral responses that are produced at the very end, we chose to evaluate the model based on three MEG components that provide “snapshots” of the reading process at various stages. These components are known to manifest a distinct pattern of “behavior” in the way they respond to different experimental conditions (Figure 2), akin to what the Reviewer refers to as a “pattern of mistakes”. The model was first evaluated on its ability to replicate the behavior of the MEG components in a qualitative manner (Figure 4). Only then did we move on to a quantitative correlation analysis. In this manner, we feel we are in agreement with the approach advocated by the Reviewer.

      In the Introduction, we now clarify:

      “Another limitation is that these models have primarily focused on feed-back lexicosemantic effects while oversimplifying the initial feed-forward processing of the visual input.

      […]

      We sought to construct a model that is able to recognize words regardless of length, size, typeface and rotation, as well as humans can, so essentially perfectly, whilst producing activity that mimics the type-I, type-II, and N400m components which serve as snapshots of this process unfolding in the brain.

      […]

      These variations were first evaluated on their ability to replicate the experimental effects in that study, namely that the type-I response is larger for noise embedded words than all other stimuli, the type-II response is larger for all letter strings than symbols, and that the N400m is larger for real and pseudowords than consonant strings. Once a variation was found that could reproduce these effects satisfactorily, it was further evaluated based on the correlation between the amount of activation of the units in the model and MEG response amplitude.”

      To make this prerequisite more clear, we have removed what was previously Figure 5, which showed the correlation between the various models and the MEG components outside the context of their response patterns. Instead, these correlation values are now always presented next to the response patterns (Figures 4 and 5, and Supplementary Figures 2-6 in the revised manuscript). This invites the reader to always consider these metrics in relation to one another.

      One example of a substantial discrepancy between this model and neural activations is that, while incorporating frequency weighting into the training data is shown to slightly increase neural correlation with the model, Figure 7 shows that no layer of the model appears directly sensitive to word frequency. This is in stark contrast to the strong neural sensitivity to word frequency seen in EEG (e.g. Dambacher et al 2006 Brain Research), fMRI (e.g. Kronbichler et al 2004 NeuroImage), MEG (e.g. Huizeling et al 2021 Neurobio. Lang.), and intracranial (e.g. Woolnough et al 2022 J. Neurosci.) recordings. Figure 7 also demonstrates that the late stages of the model show a strong negative correlation with font size, whereas later stages of neural visual word processing are typically insensitive to differences in visual features, instead showing sensitivity to lexical factors.

      We are glad the reviewer brought up the topic of frequency balancing, as it is a good example of the importance of the qualitative analysis. Frequency balancing during training only had a moderate impact on correlation scores and from that point of view does not seem impactful. However, when we look at the qualitative evaluation, we see that with a large vocabulary, a model without frequency balancing fails to properly distinguish between consonant strings and (pseudo)words (Figure 4, 5th row). Hence, from the point of view of being able to reproduce experimental effects, frequency balancing had a large impact. We now discuss this more explicitly in the revised Discussion section:

      “Overall, we found that a qualitative evaluation of the response profiles was more helpful than correlation scores. Often, a deficit in the response profile of a layer that would cause a decrease in correlation on one condition would be masked by an increased correlation in another condition. A notable example is the necessity for frequency-balancing the training data when building models with a vocabulary of 10 000. Going by correlation score alone, there does not seem to be much difference between the model trained with and without frequency balancing (Figure 4A, fifth row versus bottom row). However, without frequency balancing, we found that the model did not show a response profile where consonant strings were distinguished from words and pseudowords (Figure 4A, fifth row), which is an important behavioral trait that sets the N400m component apart from the Type-II component (Figure 2D). This underlines the importance of the qualitative evaluation in this study, which was only possible because of a straightforward link between the activity simulated within a model to measurements obtained from the brain, combined with the presence of clear experimental conditions.”

      It is true that the model, even with frequency balancing, only captures letter- and bigram-frequency effects and not the word-frequency effects that we know the N400m is sensitive to. Since our model is restricted to feed-forward processes, this finding adds to the evidence that frequency-modulated effects are driven by feed-back processes as modeled by Nour Eddine et al. (2024, doi:10.1016/j.cognition.2024.105755). See also our response to the next comment by the Reviewer, where we discuss feed-back connections. We have added the following to the section about model limitations in the revised Discussion:

      “The fact that the model failed to simulate the effects of word-frequency on the N400m (Figure 8), even after frequency-balancing of the training data, is additional evidence that this effect may be driven by feed-back activity, as for example modeled by Nour Eddine et al. (2024).”

      Like the Reviewer, we initially thought that later stages of neural visual word processing would be insensitive to differences in font size. When diving into the literature to find support for this claim, we found only a few works directly studying the effect of font size on evoked responses, but, surprisingly, what we did find seemed to align with our model. We have added the following to our revised Discussion:

      “The fully connected linear layers in the model show a negative correlation with font size. While the N400 has been shown to be unaffected by font size during repetition priming (Chauncey et al., 2008), it has been shown that in the absence of priming, larger font sizes decrease the evoked activity in the 300–500 ms window (Bayer et al., 2012; Schindler et al., 2018). Those studies refer to the activity within this time window, which seems to encompass the N400, as early posterior negativity (EPN). What possibly happens in the model is that an increase in font size causes an initial stronger activation in the first layers, due to more convolution units receiving input. This leads to a better signal-to-noise ratio (SNR) later on, as the noise added to the activation of the units remains constant whilst the amplitude of the input signal increases. A better SNR ultimately translates into less co-activation of units corresponding to orthographic neighbours in the final layers, hence to a decrease in overall layer activity.”
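      The SNR account in the quoted paragraph can be illustrated with a toy simulation (entirely our own construction: the fixed noise level, the half-strength evidence given to neighbours, and the unit counts are arbitrary assumptions, not values from the model):

```python
import random

# Toy model: each unit's activation is its input evidence plus fixed Gaussian
# noise, rectified. Orthographic-neighbour units receive partial evidence
# (assumed half of the target's). With the noise held constant, scaling up
# the input signal (a larger font) reduces neighbour activity relative to
# the input strength, mimicking a decrease in overall layer activity.
random.seed(0)
NOISE_SD = 1.0  # constant additive noise on unit activations (assumed)

def neighbour_activity(signal: float, n_neighbours: int = 20,
                       trials: int = 2000) -> float:
    """Mean rectified neighbour activity per unit of input signal."""
    total = 0.0
    for _ in range(trials):
        total += sum(max(0.0, 0.5 * signal + random.gauss(0, NOISE_SD))
                     for _ in range(n_neighbours))
    return total / trials / signal

small_font = neighbour_activity(signal=1.0)
large_font = neighbour_activity(signal=4.0)
assert large_font < small_font  # better SNR -> relatively less co-activation
```

      The direction of the effect, not the numbers, is the point: the noise floor contributes proportionally less as the signal grows.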

      Another example of the mismatch between this model and the visual cortex is the lack of feedback connections in the model. Within the visual cortex, there are extensive feedback connections, with later processing stages providing recursive feedback to earlier stages. This is especially evident in reading, where feedback from lexical-level processes feeds back to letter-level processes (e.g. Heilbron et al 2020 Nature Comms.). This feedback is especially relevant for the reading of words in noisy conditions, as tested in the current manuscript, as lexical knowledge enhances letter representation in the visual cortex (the word superiority effect). This results in neural activity in multiple cortical areas varying over time, changing selectivity within a region at different measured time points (e.g. Woolnough et al 2021 Nature Human Behav.), which in the current study is simplified down to three discrete time windows, each attributed to different spatial locations.

      We agree with the Reviewer that a full model of reading in the brain must include feed-back connections and share their sentiment that these feed-back processes play an important role and are a fascinating topic to study. The intent for the model presented in our study is very much to be a stepping stone towards extending the capabilities of models that do include such connections.

      However, there is a problem of scale that cannot be ignored.

      Current models of reading that do include feedback connections fall into the category we refer to in the paper as “traditional models”: they are all only a few layers deep and operate on very simplified inputs, such as pre-defined line segments, a few pixels, or even a list of pre-recognized letters. The Heilbron et al. 2020 study that the Reviewer refers to is a good example of such a model. (This excellent and relevant work was somehow overlooked in our literature discussion in the Introduction. We thank the Reviewer for pointing it out to us.) Models incorporating realistic feed-back activity need these simplifications, because they have a tendency to no longer converge when there are too many layers and units. However, in order for models of reading to be able to simulate cognitive behavior such as resolving variations in font size or typeface, or distinguishing text from non-text, they need to operate on something close to pixel-level data, which means they need many layers and units.

      Hence, as a stepping stone, it is reasonable to evaluate a model that has the necessary scale, but lacks the feed-back connections that would be problematic at this scale, to see what it can and cannot do in terms of explaining experimental effects in neuroimaging studies. This was the intended scope of our study. For the revision, we have attempted to make this more clear.

      We have changed the title to be:

      “Convolutional networks can model the functional modulation of the MEG responses associated with feed-forward processes during visual word recognition”

      We also added the following to the Introduction:

      “The simulated environments in these models are extremely simplified, partly due to computational limitations and partly due to the complex interaction of feed-forward and feed-back connectivity that causes problems with convergence when the model grows too large. Consequently, these models have primarily focused on feed-back lexico-semantic effects while oversimplifying the initial feed-forward processing of the visual input. 

      […]

      This rather high level of visual representation sidesteps having to deal with issues such as visual noise, letters with different scales, rotations and fonts, segmentation of the individual letters, and so on. More importantly, it makes it impossible to create the visual noise and symbol string conditions used in the MEG study to modulate the type-I and type-II components. In order to model the process of visual word recognition to the extent where one may reproduce neuroimaging studies such as Vartiainen et al. (2011), we need to start with a model of vision that is able to directly operate on the pixels of a stimulus. We sought to construct a model that is able to recognize words regardless of length, size, typeface and rotation with very high accuracy, whilst producing activity that mimics the type-I, type-II, and N400m components which serve as snapshots of this process unfolding in the brain. For this model, we chose to focus on the early feed-forward processing occurring during visual word recognition, as the experimental setup in the MEG study was designed to demonstrate, rather than feed-back effects.

      […]

      By doing so, we restrict ourselves to an investigation of how well the three evoked components can be explained by a feed-forward CNN in an experimental setting designed to demonstrate feed-forward effects. As such, the goal is not to present a complete model of all aspects of reading, which should include feed-back effects, but rather to demonstrate the effectiveness of using a model that has a realistic form of input when the aim is to align the model with the evoked responses observed during visual word recognition.”

      And we have added the following to the Discussion section:

      “In this paper we have restricted our simulations to feed-forward processes. Now, the way is open to incorporate convolution-and-pooling principles in models of reading that simulate feed-back processes as well, which should allow the model to capture more nuance in the Type-II and N400m components, as well as extend the simulation to encompass a realistic semantic representation. A promising way forward may be to use a network architecture like CORNet (Kubilius et al., 2019), that performs convolution multiple times in a recurrent fashion, yet simultaneously propagates activity forward after each pass. The introduction of recursion into the model will furthermore align it better with traditional-style models, since it can cause a model to exhibit attractor behavior (McLeod et al., 2000), which will be especially important when extending the model into the semantic domain.

      Furthermore, convolution-and-pooling has recently been explored in the domain of predictive coding models (Ororbia & Mali, 2023), a type of model that seems particularly well suited to model feed-back processes during reading (Gagl et al., 2020; Heilbron et al., 2020; Nour Eddine et al., 2024).”

      We also would like to point out to the Reviewer that we did in fact perform a correlation between the model and the MNE-dSPM source estimate of all cortical locations and timepoints (Figure 7B). Such a brain-wide correlation map confirms that the three dipole groups are excellent summaries of when and where interesting effects occur within this dataset.

      The presented model needs substantial further development to be able to replicate, both behaviorally and neurally, many of the well-characterized phenomena seen in human behavior and neural recordings that are fundamental hallmarks of human visual word processing. Until that point, it is unclear what novel contributions can be gleaned from correlating low-dimensional model weights from these computational models with human neural data.

      We hope that our revisions have clarified the goals and scope of this study. The CNN model we present in this study is a small but, we feel, essential piece in a bigger effort to employ deep learning techniques to further enhance already existing models of reading. In our revision, we have extended our discussion of where to go from here and outlined our vision of how these techniques could help us better model the phenomena the Reviewer speaks of. We agree with the Reviewer that there is a long way to go, and we are excited to be a part of it.

      In addition to the changes described above, we now end the Discussion section as follows: 

      “Despite its limitations, our model is an important milestone for computational models of reading, leveraging deep learning techniques to encompass the entire computational process starting from raw pixel values to representations of wordforms in the mental lexicon. The overall goal is to work towards models that can reproduce the dynamics observed in the brain activity recorded during the many neuroimaging experiments performed with human volunteers over the last few decades. To achieve this, models need to be able to operate on more realistic inputs than a collection of predefined lines or letter banks (for example: Coltheart et al., 2001; Heilbron et al., 2020; Laszlo & Armstrong, 2014; McClelland & Rumelhart, 1981; Nour Eddine et al., 2024). We have shown that even without feed-back connections, a CNN can simulate the behavior of three important MEG evoked components across a range of experimental conditions, but only if unit activations are noisy and the frequency of occurrence of words in the training dataset mimics their frequency of use in actual language.”

      Reviewer #3 (Public Review):

      The paper is rather qualitative in nature. In particular, the authors show that some resemblance exists between the behavior of some layers and some parts of the brain, but it is hard to quantitively understand how strong the resemblances are in each layer, and the exact impact of experimental settings such as the frequency balancing (which seems to only have a very moderate effect according to Figure 5).

      The large focus on a qualitative evaluation of the model is intentional. The ability of the model to reproduce experimental effects (Figure 4) is a pre-requisite for any subsequent quantitative metrics (such as correlation) to be valid. The introduction of frequency balancing is a good example of this. As the reviewer points out, frequency balancing during training has only a moderate impact on correlation scores and from that point of view does not seem impactful. However, when we look at the qualitative evaluation, we see that with a large vocabulary, a model without frequency balancing fails to properly distinguish between consonant strings and (pseudo)words (Figure 4, 5th row). Hence, from the point of view of being able to reproduce experimental effects, frequency balancing has a large impact.

      That said, the reviewer is right to highlight the value of quantitative analysis. An important limitation of the “traditional” models of reading that do not employ deep learning is that they operate in unrealistically simplified environments (e.g. input as predefined line segments, words of a fixed length), which makes a quantitative comparison with brain data problematic. The main benefit that deep learning brings may very well be the increase in scale that makes more direct comparisons with brain data possible. In our revision we attempt to capitalize on this benefit more. The reviewer has provided some helpful suggestions for doing so in their recommendations, which we discuss in detail below.

      We have added the following discussion on the topic of qualitative versus quantitative analysis to the Introduction:

      “We sought to construct a model that is able to recognize words regardless of length, size, typeface and rotation, as well as humans can, so essentially perfectly, whilst producing activity that mimics the type-I, type-II, and N400m components which serve as snapshots of this process unfolding in the brain.

      […]

      These variations were first evaluated on their ability to replicate the experimental effects in that study, namely that the type-I response is larger for noise embedded words than all other stimuli, the type-II response is larger for all letter strings than symbols, and that the N400m is larger for real and pseudowords than consonant strings. Once a variation was found that could reproduce these effects satisfactorily, it was further evaluated based on the correlation between the amount of activation of the units in the model and MEG response amplitude.”

      We follow this up in the Discussion with a new sub-section entitled “On the importance of experimental contrasts and qualitative analysis of the model”.

      The experiments only consider a rather outdated vision model (VGG).

      VGG was designed to use a minimal number of operations (convolution-and-pooling, fully-connected linear steps, ReLU activations, and batch normalization) and rely mostly on scale to solve the classification task. This makes VGG a good place to start our explorations and see how far a basic CNN can take us in terms of explaining experimental MEG effects in visual word recognition. However, we agree with the reviewer that it is easy to envision more advanced models that could potentially explain more. In our revision, we expand on the question of where to go from here and outline our vision on what types of models would be worth investigating and how one may go about doing that in a way that provides insights beyond higher correlation values.

      We have included the following in our Discussion sub-section on “Limitations of the current model and the path forward”:

      “The VGG-11 architecture was originally designed to achieve high image classification accuracy on the ImageNet challenge (Simonyan & Zisserman, 2015). Although we have introduced some modifications that make the model more biologically plausible, it still falls short in many ways as a complete model of brain function during reading.

      […]

      In this paper we have restricted our simulations to feed-forward processes. Now, the way is open to incorporate convolution-and-pooling principles in models of reading that simulate feed-back processes as well, which should allow the model to capture more nuance in the Type-II and N400m components, as well as extend the simulation to encompass a realistic semantic representation. A promising way forward may be to use a network architecture like CORNet (Kubilius et al., 2019), that performs convolution multiple times in a recurrent fashion, yet simultaneously propagates activity forward after each pass. The introduction of recursion into the model will furthermore align it better with traditional-style models, since it can cause a model to exhibit attractor behavior (McLeod et al., 2000), which will be especially important when extending the model into the semantic domain. Furthermore, convolution-and-pooling has recently been explored in the domain of predictive coding models (Ororbia & Mali, 2023), a type of model that seems particularly well suited to model feed-back processes during reading (Gagl et al., 2020; Heilbron et al., 2020; Nour Eddine et al., 2024).”

      Reviewer #3 (Recommendations For The Authors):

      (1) The method used to select the experimental conditions under which the behavior of the CNN is the most brain-like is rather qualitative (Figure 4). It would have been nice to have a plot where the noisiness of the activations, the vocabulary size, and the amount of frequency balancing are varied continuously, showing how these three parameters impact the correlation of the model layers with the MEG responses.

      We now include this analysis (Figure 6 in the revised manuscript, Supplementary Figures 4–7) and discuss these factors in the revised Results section:

      “Various other aspects of the model architecture were evaluated which ultimately did not lead to any improvements of the model. The response profiles can be found in the supplementary information (Supplementary Figures 4–7) and the correlations between the models and the MEG components are presented in Figure 6. The vocabulary of the final model (10 000) exceeds the number of units in its fully-connected layers, which means that a bottleneck is created in which a sub-lexical representation is formed. The number of units in the fully-connected layers, i.e. the width of the bottleneck, has some effect on the correlation between model and brain (Figure 6A), and the amount of noise added to the unit activations less so (Figure 6B). We already saw that the size of the vocabulary, i.e. the number of word-forms in the training data and the number of units in the output layer of the model, had a large effect on the response profiles (Figure 4). Having a large vocabulary is of course desirable from a functional point of view, but it also modestly improves the correlation between model and brain (Figure 6C). For large vocabularies, we found it beneficial to apply frequency balancing to the training data, meaning that the number of times a word-form appears in the training data is scaled according to its frequency in a large text corpus. However, this cannot be a one-to-one scaling, since the most frequent words occur so much more often than other words that the training data would consist mostly of the ten most common words, with less common words occurring only once or not at all. Therefore, we decided to scale not by the frequency f directly, but by f^s, where 0 < s < 1, opting for s = 0.2 for the final model (Figure 6D).”
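
      The frequency-balancing rule described above (scaling training repetitions by f raised to a power s, with 0 < s < 1, rather than by the raw frequency f) can be sketched in a few lines. The corpus counts and the base repetition count below are illustrative assumptions, not values from the paper:

```python
def balanced_counts(corpus_freq, s=0.2, base=10):
    """Scale the number of training repetitions of each word-form by f**s
    (0 < s < 1) instead of its raw corpus frequency f, so that very common
    words do not completely dominate the training data."""
    return {word: max(1, round(base * freq ** s))
            for word, freq in corpus_freq.items()}

# Illustrative corpus counts: raw frequencies span a 50000:12 ratio,
# but with s = 0.2 the repetition counts span only about 5:1.
counts = balanced_counts({"the": 50000, "house": 800, "zebra": 12}, s=0.2)
# counts -> {"the": 87, "house": 38, "zebra": 16}
```

      With s = 0.2 the compression is strong enough that rare word-forms stay represented in the training data while frequent ones are still seen more often.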

      (2) It is not clear which layers exactly correspond to which of the three response components. For this to be clearer, it would have been nice to have a plot with all the layers of VGG on the x-axis and three curves corresponding to the correlation of each layer with each of the three response components.

      This is a great suggestion that we were happy to incorporate in the revised version of the manuscript. Every figure comparing the response patterns of the model and brain now includes a panel depicting the correlation between each layer of the model and each of the three MEG components (Figures 4 & 5, Supplementary Figures 2-5). This has given us (and now also the reader) the ability to better benchmark the different models quantitatively, adding to our discussion on qualitative to quantitative analysis.

      (3) It is not clear to me why the authors report the correlation of all layers with the MEG responses in Figure 5: why not only report the correlation of the final layers for N400, and that of the first layers for type-I?

      We agree with the reviewer that it would have been better to compare the correlation scores for those layers whose response profiles match the MEG components. The old Figure 5 has been merged with Figure 4 and now provides the correlations between all layers and all MEG components; in addition, we have taken the Reviewer’s advice and marked the layers that qualitatively best correspond to each MEG component, so the reader can take that into account when interpreting the correlation scores.

      (4) The authors mention that the reason that they did not reproduce the protocol with more advanced vision models is that they needed the minimal setup capable of yielding the desired experiment effect. I am not fully convinced by this and think the paper could be significantly strengthened by reporting results for a vision transformer, in particular to study the role of attention layers which are expected to play an important role in processing higher-level features.

      We appreciate and share the Reviewer’s enthusiasm for seeing how other model architectures would fare when it comes to modeling MEG components. However, we regard modifying the core model architecture (i.e., a series of convolution-and-pooling steps followed by fully-connected layers) as out of scope for the current paper.

      One of the key points of our study is to create a model that reproduces the experimental effects of an existing MEG study, which necessitates modeling the initial feed-forward processing from pixel to word-form. For this purpose, a convolution-and-pooling model was the obvious choice, because these operations play a big role in cognitive models of vision in general. In order to properly capture all experimental contrasts in the MEG study, many variations of the CNN were trained and evaluated. This iterative design process concluded when all experimental contrasts could be faithfully reproduced.

      If we were to explore different model architectures, such as a transformer architecture, reproducing the experimental contrasts of the MEG study would no longer be the end goal, and it would be unclear what the end goal should be. Maximizing correlation scores has no natural stopping point, and there is a nearly endless number of model architectures one could try. We could bring in a second MEG study with experimental contrasts that the CNN cannot explain but a transformer architecture potentially could, and set the end goal to be explaining all experimental effects in both MEG studies. But even if we had access to such a dataset, this would almost double the length of the paper, which is already too long.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      (1) The results do not support the conclusions. The main "selling point", as summarized in the title, is that the apoptotic rate of zebrafish motor neurons during development is strikingly low (~2%) compared with the much higher estimate (~50%) from previous studies in other systems. The results used to support the conclusion are that only a small percentage (under 2%) of apoptotic cells were found in a large population across a variety of stages (24-120 hpf). This is fundamentally flawed logic, as a percentage measured over a short time window cannot represent the percentage over the long term. For example, in any one year under 1% of the human population dies, but over 100 years >99% of the starting group will have died. To find the real percentage of motor neurons that die, the motor neurons born at different times must be tracked over the long term, or the rate at which new motor neurons are born must be estimated. A similar argument applies to the macrophage results.

      In the revised manuscript (revised Figure 4), we extended the observation time window as far as possible, from 24 hpf to 240 hpf. After 240 hpf, the transparency of the zebrafish body decreased dramatically, which made optical imaging quite difficult.

      We are confident that this 24-240 hpf time window covers the major period during which motor neurons undergo programmed cell death during early zebrafish development. We chose this observation time window for two reasons: 1) Previous studies showed that although the time windows of motor neuron death vary in chick (E5-E10), mouse (E11.5-E15.5), rat (E15-E18), and human (11-25 weeks of gestation), the common feature of these windows is that they are all developmental periods during which motor neurons make contact with muscle cells. The contact between zebrafish motor neurons and muscle cells occurs before 72 hpf, which is included in our observation time window. 2) Most zebrafish organs form before 48-72 hpf, and hatching is completed during 48-72 hpf. Food-seeking and active avoidance behaviors also start at 72 hpf, indicating that motor neurons are fully functional by 72 hpf.

      Previous studies in zebrafish have shown that the production of spinal cord motor neurons largely ceases before 48 hpf, and then the motor neurons remain largely constant until adulthood (doi: 10.1016/j.celrep.2015.09.050; 10.1016/j.devcel.2013.04.012; 10.1007/BF00304606; 10.3389/fcell.2021.640414). Our observation time window covers the major motor neuron production process. Therefore, we believe that neurogenesis will not affect our findings and conclusions.

      Although we are confident that 240 h tracking is long enough to measure the motor neuron death rate, several sentences have been added in the discussion part, “In our manuscript, we tracked the motor neuron death in live zebrafish until 240 hpf, which was the longest time window we could achieve. But there was still a possibility that zebrafish motor neurons might die after 240 hpf.”

      We agreed that the “2%” description might not be very accurate. Thus, we have revised our title to “Zebrafish live imaging reveals a surprisingly small percentage of spinal cord motor neurons die during early development.”

      (2) The conclusion regarding timing of axon and cell body caspase activation and apoptosis timing also has clear issues. The ~minutes measurement are too long as compared to the transport/diffusion timescale between the cell body and the axon, caspase activity could have been activated in the cell body and either caspase or the cleaved sensor move to the axon in several seconds. The authors' results are not high frequency enough to resolve these dynamics. Many statements suggest oversight of literature, for example, in abstract "however, there is still no real-time observation showing this dying process in live animals.".

      Real-time imaging of live animals is quite challenging in the field. Currently, using confocal microscopy, we can only achieve minute-scale tracking. In the future, with more advanced imaging techniques, the sensor fish in the present study may provide us with more detailed information on motor neuron death. We have removed “real-time” from our revised manuscript. We also revised the mentioned sentence in the abstract.

      (3) Many statements should use more scholarly terms and descriptions from the spinal cord or motorneuron, neuromuscular development fields, such as line 87 "their axons converged into one bundle to extend into individual somite, which serves as a functional unit for the development and contraction of muscle cells"

      We have removed this sentence.

      (4) The transgenic line is perhaps the most meaningful contribution to the field as the work stands. However, mnx1 promoter is well known for its non-specific activation - while the images do suggest the authors' line is good, motorneuron markers should be used to validate the line. This is especially important for assessing this population later as mnx1 may be turned off in mature neurons. The author's response regarding mnx1 specificity does not mitigate the original concern.

      The mnx1 promoter has been widely used to label motor neurons in transgenic zebrafish, and previous studies have shown that most of the cells labeled in mnx1 transgenic zebrafish are motor neurons. In this study, we observed that the labeled neurons in our sensor zebrafish had green cell bodies inside the spinal cord and extended axons into the muscle region, an important morphological feature of motor neurons.

      Furthermore, a few of those green cell bodies turned into blue apoptotic bodies inside the spinal cord and changed to blue axons in the muscle regions at the same time, which strongly suggests that those apoptotic neurons are not interneurons.

      In fact, no matter what method is used, such as using antibodies to stain specific markers to label motor neurons, 100% specificity cannot be achieved. More importantly, although the mnx1 promoter might have labeled some interneurons, this will not affect our major finding that only a small percentage of spinal cord motor neurons die during the early development of zebrafish.

      Reviewer 2:

      (1) Title: The 50% figure of motor neurons dying through apoptosis during early vertebrate development is not precisely accurate. In papers referenced by the authors, there is a wide distribution of percentages of motor neurons that die depending on the species and the spinal cord region. In addition, the authors did not examine limb-innervating motor neurons, which are the ones best studied in motor neuron programmed cell death in other species. Thus, a better title that reflects what they actually show would be something like "A surprisingly small percentage of early developing zebrafish motor neurons die through apoptosis in non-limb innervating regions of the spinal cord."

      In fish, there are no such structures as limbs, although fins may be evolutionarily related to limbs. In our manuscript, we studied naturally occurring motor neuron death in the whole spinal cord during the early stage of zebrafish development. The death of limb-innervating motor neurons has been extensively studied in chicks and rodents, as these models are readily amenable to manipulations such as amputation. However, previous studies have shown that this dramatic motor neuron death occurs not only in limb-innervating motor neurons but also in other spinal cord motor neurons (doi: 10.1006/dbio.1999.9413).

      We have revised our title to “Zebrafish live imaging reveals a surprisingly small percentage of spinal cord motor neurons die during early development.”

      (2) lines 18-19: "embryonic stage of vertebrates" is very broad, since zebrafish are also vertebrates; it would be better to be more specific

      lines 25-26: The authors should be more specific about which animals have widespread neuronal cell death.

      We have revised our manuscript accordingly.

      (3) lines 98-99; 110-111; 113; 122-123; 140-141: A cell can undergo apoptosis. But an axon, which is only part of a cell, cannot undergo apoptosis. Especially since the axon doesn't have a separate nucleus, and the definition of apoptosis usually includes nuclear fragmentation. A better subheading would describe the result, which is that caspase activation is seen in both the cell body and the axon.

      We have revised the subheadings and related words in the manuscript accordingly. In the introduction, we also revised the expression of the third aim from “Which part of a neuron (cell body vs. axon) will die first?” to “Which part of a neuron (cell body vs. axon) will degrade first?”.

      (4) lines 159-160; 178-179: This is an oversimplification of the literature. The authors should spell out which populations of motor neuron have been examined and say something about the similarities and difference in motor neuron death.

      We have revised it accordingly.

      (5) lines 200; 216: The authors did not observe macrophages engulfing motor neurons. But that does not mean that they cannot. Making the conclusion stated in this subheading would require some kind of experiment, not just observations.

      We did observe a few colocalizations of macrophages with dead motor neurons. To express these data more accurately, in the revised manuscript we used "colocalization" in place of "engulfment." The subheading has been revised to "Most dead motor neurons were not colocalized with macrophages." Accordingly, panel C of Figure 5 has also been revised.

      (6) lines 234-246: The authors seem to have missed the point about VaP motor neuron death, which was two-fold. First, VaP death has been previously described, thus it could serve as a control for the work in this paper, especially since the conditions underlying VaP death and survival have been experimentally tested. Second, they should acknowledge that previous work showed that at least some motor neuron death in zebrafish differs from that described in chick and rodents. This conclusion came from work showing that death of VaP is independent of limitations in muscle innervation area, suggesting it is not coupled to muscle-derived neurotrophic factors.

      Figures: The authors should say which level of the spinal cord they examined in each figure.

      We have compared our findings with previous findings in the revised manuscript. The death of VaP motor neurons is not related to neurotrophic factors, but the death of other motor neurons may be related to neurotrophic factors, which needs further study and evidence. Our study examined the overall motor neuron apoptosis regardless of the causes and locations. To avoid misunderstanding, in the revised manuscript, we removed the data and words related to neurotrophic factors.

      We also extended the observation time window as far as possible, from 24 hpf to 240 hpf (revised Figure 4). After 240 hpf, the transparency of the zebrafish body decreased dramatically, which made optical imaging quite difficult.

    1. Figure 4: Area of potential contamination adjacent to Kv.10.

      What is the data source?

      Is there any more information that could be included in the popup? At the moment it only says "1" when you click on the point.

    2. An existing borehole installation west of Kv.10 is an encouraging sign.

      Explain why. Refer to the map material showing that a borehole field has existed there previously, which means that drilling permits have been issued in the area before.

    3. optimistic assumption of a 6-metre average spacing

      Write something like:

      Drilling areas on each property have been identified with regard to [utility lines, distances to neighbouring properties, etc.]. The number of boreholes that fit within these areas varies depending on how the geo-energy system is designed: if the borehole field is well balanced with respect to heat extraction and recharge over the year, the boreholes can be drilled at closer spacing, and vice versa.

      For this reason, the number of boreholes has been estimated for two different scenarios: 6 m spacing, reflecting a balanced borehole field, and 10 m spacing for a case where the field is relatively unbalanced.

    4. The property, designated Kv.38A, is located in the northern part of the Östra Hagastaden area, with Norra Stationsvägen to the south and the E4/E20 and Värtabanan to the north. The following documents show background material and drawings:

      Great to have these updated maps.

      I think the layers need some explanation so that an outside reader understands what they mean, e.g.:

      • Identified drilling area
      • Borehole centre point

      etc. This explanation can be written in the text, but the map legend could also be a bit more descriptive.

    1. I avoid contact with fans. Occasionally, I watch trash TV because I think the poet shouldn’t avert his eyes. I want to know what others aspire to. I’m a good but limited cook. My steaks are excellent, but they’ll never touch what you can get on any street corner in Argentina. Tree huggers are suspicious to me. Yoga classes for five-year-olds—which in California are a thing—are suspicious to me. I don’t use social media. If you see my profile anywhere there, you can be sure it’s a fake. I don’t use a smartphone. I never quite trust the media, so I get a truer picture of the political situation by going to multiple sources—the Western media, Al Jazeera, Russian TV, and occasionally by downloading the whole of a politician’s speech. I trust the Oxford English Dictionary, which is one of mankind’s greatest cultural achievements. I mean the one in twenty massive volumes with six hundred thousand entries and more than three million quotations culled from all over the English-speaking world and over a thousand years. I reckon thousands of researchers and amateur helpers spent 150 years combing through everything recorded. For me, it is the book of books, the one I would take to a desert island. It is inexhaustible, a miracle. The first time I visited Oliver Sacks on Wards Island north of Manhattan, I had mislaid the house number but knew the name of the little street. It was evening, winter-time; the slightly sloping street was icy. I parked and tiptoed along the icy pavement looking into every lit-up home. None of the windows had curtains. Through one window I saw a man sprawled on a sofa with one of the hefty volumes of the OED propped on his chest. I knew that had to be him, and so it was. Our first subject was the dictionary; for him as well, it was the book of books.

      The book of books.

    1. Placement of utility networks (nyttjänstenät): Since the project is still at an early stage, the position of the utility networks is preliminary. It is important to synchronise the borehole planning with any changes in the networks' placement.

      What is a "nyttjänstenät"?

    2. Results and Analysis

      Describe which criteria we looked at to produce these estimates. It might be more logical to put the "Specific problem areas" section first, opening with the conditions required for drilling to be possible (or not possible).

    3. Figure 1: Hagastaden – west and east

      Centre the caption; this applies to all figures.

      Is it possible to create a frame around the map?

      UX-wise this feels like an image/screenshot rather than an interactive map :) It is really nicely made, but the reader needs to understand what they can do. You could perhaps add a background map that is active by default with low opacity, so that your georeferenced PDF stands out while it remains clear that this is a map.

      You could create a callout note explaining that the report is interactive. This is how I phrased it in another report:

      "This report is produced as a so-called interactive report, which means that the reader can interact with the data and results in figures and other media in the report. The aim is to give the reader greater insight into and understanding of the results conveyed by the report, e.g. through the ability to see extra information by hovering over data points or to zoom in on areas of interest in figures."

      Also: highlight more clearly which properties are involved; even though a lot of background material is included, it is not entirely obvious which properties the report is about.

    4. drillable areas based on assumptions about borehole spacing (6 m and 10 m respectively)

      You can probably be less specific here in the introduction; write instead "assessment of available drilling areas".

    1. their apartment had no windows to the outside; nobody wanted to see theovercrowded external world

      It is strange how they try to help while ignoring the very people they claim to be helping. Economic profit? The rich always end up taking advantage of people who are poor and desperate.

      The mvnw file is a shell script for Linux and macOS, while mvnw.cmd is for Windows. The benefit of this Maven wrapper is that we can easily switch between projects that depend on different Maven versions without much setup on our local machine. While Maven is quite mature and most versions are backward-compatible (especially since Maven 3), things are completely different when using Gradle, where this wrapper concept originated.

      Maven borrowed the wrapper concept from Gradle.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates the relationship between climate variables and malaria incidence using monthly records of rainfall, temperature, and a measure of ENSO in a lowland region of Kenya in East Africa. Wavelet analyses show significant variability at the seasonal scale, chiefly at the 6-month period, with some variation in this signal over time, and some additional variability at the 12-month scale for some variables. As conducted, the analyses show weak (non-significant) signals at the interannual time scales (longer than seasonal). Cross-wavelet analysis also highlights the 6-month scale and the association of malaria and climate variables at that scale, with some signal at 12 months, reflecting the role of climate in seasonality. Evidence is presented for some small changes in the lags of the response of malaria to the seasonal climate drivers over time.

      Strengths:

      Although there have been many studies of climate drivers of malaria dynamics in East Africa, these analyses have been largely focused on highlands where these drivers are expected to exhibit the strongest signal of association with disease burden at interannual and longer time scales. It is therefore of interest to take advantage of a relatively long time series of cases to examine the role of climate variables in more endemic malaria in lowlands.

      Weaknesses:

      (1) Major comments:

      The work is not sufficiently placed in the context of what is known about climate variability in East Africa, and the role of climate variables in the temporal variation of malaria cases in this region. This context includes the relationship between large (global/regional) drivers of interannual climate variability such as ENSO (and the Indian Ocean Dipole) and local temporal patterns in rainfall and temperature. There is for example literature on the influence of those drivers and the short and long rains in East Africa. That is, phenomena such as ENSO would influence malaria through those local climate variables. This context should be considered when formulating and interpreting the analyses.

      There are conceptual problems with the design of the analyses which can limit the findings on association. It is not surprising that rainfall would exhibit a clear association at seasonal scales. It is nevertheless valuable to confirm this as the authors have done and to examine the faster than 12-month scale, given the typical pattern of two rainfall seasons in this area. However, the results on temperature are less clear. If rainfall is the main limiting factor for the transmission season, the temperature variation that would matter can be during the rainy periods. One would then see an association with temperature only in particular windows of time during the year, when rainfall is sufficient (see for example, Rodo et al. Nat. Commun. 2022, for this finding in a highland region of Ethiopia). For this situation, there would be no clear association with temperature when all months are considered, and one would not find a significant relationship (or a lagged one) between peak times in this climate factor and malaria's seasonal cases. It would be difficult for the wavelet analysis to reveal such an effect. Another consideration is whether to use an ENSO variable that includes seasonality or to use an ENSO index computed as an anomaly, to focus on interannual variability. That is, it is most relevant to consider how ENSO influences time scales of variation longer than seasonal (the multiannual variation in seasonal epidemics) and for this purpose, one would typically rely on an anomaly. This choice would better enable one to see whether there is a role of ENSO at interannual time scales. It would also make sense to analyze with cross-wavelets the effect of ENSO on local climate factors, temperature, and rainfall, and not only on malaria. This would allow us to establish evidence for a chain of causality, from a global driver of interannual variability to local climate variability to malaria incidence.

      The multiresolution analysis and associated analysis of lag variations were confusing and difficult to follow as presented: (1) the lags chosen by the multiresolution analysis do not match the phase differences of the cross-wavelet analysis if I followed what was presented. On page 8, phase differences are expressed in months. I do not understand then the following statements on page 9: "The phase differences obtained by the cross-wavelet transforms were turned into lags, allowing us to plot the evolution of the lags over time". The resulting lags in Figure 6 are shorter than the phase differences provided in the text on page 8. (2) The phase difference of the cross-wavelet analyses for malaria and temperature is also too long for this climate factor to explain an effect on the vector and then on the disease. (3) In Table 3, the regression results that are highlighted are those for Land Surface Temperatures (LST) and ENSO, with a weak but significant negative linear correlation, and for LST and bednet coverage, and this is considered part of the lag analysis. The previous text and analyses up to that point do not seem to consider the relationship of ENSO and local climate variables, or that between local climate variables and bednets (which would benefit from some context for the causal pathways this would reflect).
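
      For reference, the standard conversion at issue here, from a cross-wavelet phase difference to a time lag, is lag = φ·T/(2π), where φ is the phase difference in radians and T the period. A minimal sketch (the example values are illustrative, not taken from the paper):

```python
import math

def phase_to_lag(phase_rad, period_months):
    """Convert a cross-wavelet phase difference (radians) at a given
    period into a time lag in months: lag = phase * period / (2 * pi)."""
    return phase_rad * period_months / (2 * math.pi)

# An anti-phase relation (pi radians) at the 12-month scale corresponds
# to a 6-month lag; a quarter-cycle (pi/2) at 6 months to a 1.5-month lag.
```

      Because the same phase difference maps to different lags at different periods, phase differences and lags can only be compared once the period of each component is fixed.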

      The conclusion in the Abstract: "Our study underlines the importance of considering long-term time scales when assessing malaria dynamics. The presented wavelet approach could be applicable to other infectious diseases" needs to be reformulated. The use of "long-term" time scales for those of ENSO and interannual variability is not consistent with the climate literature, where long-term could be interpreted as decadal and longer. The time scales beyond those of seasonality, especially those of climate variability, have been addressed in many malaria studies. It is not compelling to have the significance of this study be the importance of considering those time scales. This is not new. I recommend focusing on what has been done for lowland malaria and endemic regions (for example, in Laneri et al. PNAS 2015) as there has been less work for those regions than for seasonal epidemic ones of low transmission (e.g. altitude fringes and desert ones, e.g. Laneri et al. PloS Comp. Biol. 2010; Roy et al. Mal. J. 2015). Also, wavelet analyses have been used extensively by now to consider the association of climate variables and infectious diseases at multiple time scales. There is here an additional component of the analysis but the decomposition that underlies the linear regressions is also not that new, as decompositions of time series have been used before in this area. In summary, I recommend a more appropriate and compelling conclusion on what was learned about malaria at this location and what it may tell us about other, similar, locations, but not malaria dynamics everywhere.

      The conversion from monthly cases to monthly incidence needs a better explanation of the Methods, rather than a referral to another paper. This is a key aspect of the data. It may be useful to plot the monthly time series of both variables in the Supplement, for comparison.

      There is plenty of evidence of the seasonal role of rainfall on malaria's seasonality in many regions. The literature cited here to support this well-known association is quite limited. It would be useful to provide a context that better reflects the literature and some context for the environmental conditions of this lowland region that would explain the dominant role of rainfall on malaria seasonality. Two papers (from 2017 and 2019) are cited in the second paragraph of the introduction as showing that "key climatic factors are rainfall and temperatures". This is a misrepresentation of the field. That these factors matter to malaria in general has been known for a very long time given that the vectors are mosquitoes, and the cited studies are particular ones that examine the mechanistic basis of this link for modeling purposes. Either these papers are presented as examples, with a more accurate description of what they add to the earlier literature or earlier literature should be acknowledged. Also, what has been much less studied is the role of these variables at interannual time scales, as potentially mediating the effects of global drivers in teleconnections.

      (2) Minor comments:

      In relation to the conceptual issues raised above, it would be valuable to consider whether the negative association with temperature persists if one considers mean temperature during the rainy seasons only, against the total cases in the transmission season each year (as in Rodó et al. 2021). This would allow one to disentangle whether the negative association reflects a robust result or an artifact of an interaction between temperature and rainfall so that the former matters when the latter is permissive for transmission.
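      The check suggested here is simple to prototype: aggregate to one value per year using only the rainy-season months, then correlate seasonal-mean temperature with total seasonal cases. A minimal Python sketch on synthetic data (the May-September season window and all values are illustrative assumptions, not taken from the paper under review):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly data for 20 years x 12 months (illustrative only)
n_years = 20
temp = 20 + 5 * rng.standard_normal((n_years, 12))      # mean temperature
cases = rng.poisson(100, (n_years, 12)).astype(float)   # monthly cases

# Assume May-September (indices 4..8) is the rainy/transmission season
rainy = slice(4, 9)
seasonal_temp = temp[:, rainy].mean(axis=1)   # mean T in rainy season
seasonal_cases = cases[:, rainy].sum(axis=1)  # total cases per season

# Season-restricted correlation vs the naive all-months version
r_season = np.corrcoef(seasonal_temp, seasonal_cases)[0, 1]
r_all = np.corrcoef(temp.mean(axis=1), cases.sum(axis=1))[0, 1]
```

Comparing `r_season` with `r_all` on the real data would indicate whether the negative temperature association survives when rainfall is permissive for transmission, or whether it is an artifact of the temperature-rainfall interaction.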

      The conclusion in the Discussion "This suggests that minor climate variations have a limited impact on malaria incidence at shorter time scales, whereas climatic trends may play a more substantial role in shaping long-term malaria dynamics" is unsubstantiated. There is no clear result in the paper on climatic trends that I can see.

      The Abstract writes: "The true impact of climate change...". This paper is not about climate change but about climate seasonality and variability. This text needs to be changed to make it consistent with the content of the paper.

      Page 2, Introduction: The statement on Pascual et al. 2008 is not completely accurate. This paper shows an interplay of climate variability and disease dynamics, but not cycles that are completely independent of climate.

      Page 2, next sentence: "More recently, such cycles have been attributed to global climate drivers such as ENSO (Cazelles et al., 2023)". This writing is also somewhat unclear. Are you referring to the cycles for the same location in Kenya? Or generically, to the interannual variability of malaria?

      There are multiple places in the writing that could be edited.

    1. In a multithreaded Windows Forms application, it is forbidden to call methods and properties of controls from threads other than the one in which they were created.

      You cannot simply take an object from another thread and use it; it is like taking someone else's toy and playing with it.

    2. Types in the System.Windows.Forms namespace make heavy use of Win32 code that was designed to run in a single-threaded apartment. For this reason, the main method of a Windows Forms program must be marked with [STAThread]; otherwise, when Win32 UI code is called, one of two things will happen:

      Callers must invoke it in turn, one at a time; otherwise, blocking can occur.
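      The single-UI-thread rule generalizes: a call like Control.Invoke does not run the delegate on the caller's thread but posts it to the owning thread's message loop, which processes requests one at a time. A minimal Python analogy of that marshaling pattern (the `UiThread` class and its names are hypothetical, for illustration only; this is not a WinForms API):

```python
import queue
import threading

# Analogy of Control.Invoke: work is not run on the calling thread;
# it is posted to the owning thread's message loop and executed there.
class UiThread:
    def __init__(self):
        self._msgs = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Single "message loop": requests are processed sequentially,
        # so state is only ever touched from this one thread.
        while True:
            func, done = self._msgs.get()
            if func is None:
                return
            func()
            done.set()

    def invoke(self, func):
        # Marshal func onto the owning thread and block until it has run
        done = threading.Event()
        self._msgs.put((func, done))
        done.wait()

    def shutdown(self):
        self._msgs.put((None, None))
        self._thread.join()

ui = UiThread()
seen = []
ui.invoke(lambda: seen.append(threading.get_ident()))
ui.shutdown()
# seen[0] holds the owning thread's id, not the caller's
```

The serialization through one queue is also why the second note matters: if two threads each wait synchronously on the other's loop, neither request can be processed and the calls block.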

    1. Abstract: The development of long-read sequencing promises high-quality, comprehensive de novo assembly for species around the world. However, it remains challenging for genome assemblers to handle thousands of genomes, genome sizes of tens of gigabases, and terabase-scale datasets simultaneously and efficiently, which is a bottleneck for large de novo sequencing studies. A major cause is read overlapping graph construction, for which state-of-the-art tools often require terabytes of RAM and tens of days on large genomes. Such performance and scalability are not suited to the numerous samples to be sequenced. Herein, we propose xRead, an iterative overlapping graph approach that achieves high performance, scalability, and yield simultaneously. Guided by its novel read-coverage-based model, xRead uses a heuristic alignment-skeleton approach to implement incremental graph construction with highly controllable RAM usage and faster speed. For example, it can process the 1.28 Tb A. mexicanum dataset with less than 64 GB of RAM and a markedly lower time cost. Moreover, benchmarks on datasets from genomes of various sizes suggest that it achieves higher accuracy in overlap detection without loss of sensitivity, which also guarantees the quality of the produced graphs. Overall, xRead is well suited to handling many datasets from large genomes, especially with limited computational resources, and may play an important role in many de novo sequencing studies.

      This work has been peer reviewed in GigaScience (https://doi.org/10.1093/gigascience/giaf007), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer #2: Anuradha Wickramarachchi

      Overall comments.

      Authors of the manuscript have developed an iterative overlap graph construction algorithm to support genome assembly. This is both an interesting and a demanding area of research due to very recent advancements in sequencing technologies.

      Although the text in the manuscript is interesting, the grammar must be rechecked and revised. At some points it is difficult to keep track of the content, and references to the supplementary material are needed to make sense of it.

      Specific comments

      Page 1 Line 13: I believe the authors are talking about assembly sizes and not genome sizes. The sentences here could be made a bit shorter to make them easier to understand.

      Page 2 Line 19: The theoretical time complexity O(m²n²) is a bit of an overstatement given the heuristics employed by most assemblers. For example, mash distances, minimisers, and k-mer bins are used to prevent this explosion of complexity. Either acknowledge such methods or provide a range for the time complexity. It would be interesting to know the time complexities of the methods mentioned in the sentence starting at Line 15.
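      The minimiser heuristic mentioned here can be sketched briefly: instead of comparing all O(n²) read pairs, reads are bucketed by shared sketch entries, and only co-bucketed pairs become overlap candidates. A toy Python sketch (the k and w values and the sequences are illustrative; real tools hash k-mers, use canonical strands, and apply further filtering):

```python
from collections import defaultdict
from itertools import combinations

def minimizers(seq, k=5, w=4):
    """(w,k)-minimizers: the lexicographically smallest k-mer in each
    window of w consecutive k-mers (a standard sketching heuristic)."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for i in range(len(kmers) - w + 1):
        picked.add(min(kmers[i:i + w]))
    return picked

def candidate_pairs(reads, k=5, w=4):
    """Bucket reads by shared minimizers; only pairs that share a
    sketch entry are compared, instead of all O(n^2) read pairs."""
    buckets = defaultdict(set)
    for rid, seq in enumerate(reads):
        for m in minimizers(seq, k, w):
            buckets[m].add(rid)
    pairs = set()
    for rids in buckets.values():
        pairs.update(combinations(sorted(rids), 2))
    return pairs

reads = ["ACGTACGTGGAA", "CGTACGTGGAAT", "TTTTCCCCGGGG"]
# Only the two overlapping reads share a minimizer, so only the
# pair (0, 1) is passed on to the expensive alignment step.
pairs = candidate_pairs(reads)
```

Because each read contributes only a small sketch rather than all of its k-mers, the candidate set stays far below the all-vs-all quadratic bound in practice.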

      Page 5 Line 11: Was this performed with overlapping windows of 1 Gb? Otherwise, the simulations may not have reads spanning such regions.

      Page 5 Line 14: It seems you are simulating 9 + 4 + 4 datasets. This is unclear; please break it into bullet points or separate paragraphs and explain it clearly. Include the simulator information in the table itself, perhaps by making it landscape (in the supplementary).

      Fig 2: I believe the authors should expand their analysis to more recent and popular assemblers. For example, wtdbg2 is designed for noisy reads and not specifically for the more accurate R10/HiFi reads. So please include hifiasm and Flye where appropriate. Flye supports ONT out of the box and, in my experience, does produce good assemblies.

      Although you are evaluating read overlaps, it is hard to ignore the assemblers themselves just because they do not produce intermediate overlap graphs.

      Page 5-9: In the benchmarks section, please include how True Positives and False Positives were labelled. Was this from simulation data?

      Page 11: The use of xRead has been evaluated on genome assemblies. This is very important, and it is a bit unfortunate that existing assemblers are not very flexible in terms of plugging in new intermediate steps. It might be worth exploring the creation of a new assembler using the wtpoa2 CLI command of wtdbg2.

      Page 16: What will happen if you only capture reads from a single chromosome because of their longer length? I believe the objective is to gather the longest reads while covering as much of the whole genome as possible. Please comment on this.

      Page 19: In the GitHub README the download URL was wrong. Please correct it to the latest release:

      Correct: https://github.com/tcKong47/xRead/releases/download/xRead-v1.0.0.1/xRead-v1.0.0.tar.gz
      Existing: https://github.com/tcKong47/xRead/releases/download/v1.0.0/xRead-v1.0.0.tar.gz

      The make command failed with: make: *** No rule to make target `main.h', needed by `main.o'. Stop.

      It seems the release does not contain the source code, but rather a compiled version. Please update the GitHub instructions on how to compile the code properly from a git clone.

  4. Feb 2025
    1. Reviewer #3 (Public review):

      Summary:

      This paper aims to investigate how the human brain represents different forms of value and uncertainty that participate in active inference within a free-energy framework, in a two-stage decision task involving contextual information sampling, and choices between safe and risky rewards, which promotes shifting between exploration and exploitation. They examine neural correlates by recording EEG and comparing activity in the first vs second half of trials and between trials in which subjects did and did not sample contextual information, and perform a regression with free-energy-related regressors against data "mapped to source space."

      Strengths:

      This two-stage paradigm is cleverly designed to incorporate several important processes of learning, exploration/exploitation and information sampling that pertain to active inference. Although scalp/brain regions showing sensitivity to the active-inference related quantities do not necessarily suggest what role they play, they are illuminating and useful as candidate regions for further investigation. The aims are ambitious, and the methodologies are impressive. The paper lays out an extensive introduction to the free energy principle and active inference to make the findings accessible to a broad readership.

      Weaknesses:

      It is worth noting that the high lower-cutoff of 1 Hz in the bandpass filter, included to reduce the impact of EEG noise, would remove from the EEG any sustained, iteratively updated representation that evolves with learning across trials, or choice-related processes that unfold slowly over the course of the 2-second task windows. It is thus possible there are additional processes related to the active inference quantities that are missed here. This is not a flaw as one must always try to balance noise removal against signal removal in filter settings - it is just a caveat. As the authors also note, the regions showing up as correlated with model parameters change depending on source modelling method and correction for multiple comparisons, warranting some caution around the localisation aspect.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Guo and Uusisaari describes a series of experiments that employ a novel approach to address long-standing questions on the inferior olive in general and the role of the nucleo-olivary projection specifically. For the first time, they optimized the ventral approach to the inferior olive to facilitate imaging in this area that is notoriously difficult to reach. Using this approach, they are able to compare activity in two olivary regions, the PO and DAO, during different types of stimulation. They demonstrate the difference between the two regions, linked to the Aldoc identities of downstream Purkinje cells, and that there is co-activation resulting in larger events when they are clustered. Periocular stimulation also drives larger events, related to co-activation. Using optogenetic stimulation they activate the nucleo-olivary (N-O) tract and observe a wide range of responses, from excitation to inhibition. Zooming in on inhibition, they test the assumption that N-O activation can be responsible for suppression of sensory-evoked events. Instead, they suggest that the N-O input can function to suppress background activity while preserving the sensory-driven responses.

      Strengths:

      This is an important study, tackling the long-standing issue of the impossibility of imaging in the inferior olive and using that novel method to address the most relevant questions. The experiments are technically very challenging, the results are presented clearly, and the analysis is quite rigorous. There is quite a lot of room for interpretation (see Weaknesses), but the authors make an effort to cover many options.

      Weaknesses:

      The heavy anesthesia that is required during the experiment could severely impact the findings. Because of the anesthesia, the firing rate of IO neurons is found to be 0.1 Hz, significantly lower than the 1 Hz found in non-anesthetized mice. This is mentioned and discussed, but the importance of the possible consequences cannot be overstated and should be addressed more. Although the methods and results are described in sufficient detail, there are a few points that, when addressed, would improve the manuscript.

      We sincerely thank the reviewer for their encouraging comments and recognition of our study’s significance. We fully acknowledge the confounding effects of the deep anesthesia used in our experiments, which was necessary to ensure the animals’ welfare while establishing this technically demanding methodology. We elaborate on these effects below and will further clarify them in the revised manuscript.

      Ultimately, the full resolution of this issue will require recordings in awake animals, as we consider our approach an advancement from acute slice preparations but not yet a complete representation of in vivo IO function. However, key findings from our study—such as amplitude modulation with co-activation and the potential role of IO refractoriness in complex spike generation—could be further explored in existing cerebellar cortical recordings from awake, behaving animals. We hope our work will motivate re-examination of such datasets to assess whether these mechanisms contribute to overall cerebellar function.

      Reviewer #1 (Recommendations for the authors):

      On page 10 the authors indicate that 2084 events were included for DAO and 1176 for PO. Is that the total number of events? What was the average and the range per neuron and the average recording duration?

      Thank you for pointing out the lack of clarity. The sentence should say "in total, 2084 and 1176 detected events from DAO and PO were included in the study". We will add the averages and ranges of events detected per neuron in different categories, as well as the durations of the recordings (ranging from 120 s to 270 s), to the tables.

      On page 10 it is also stated that: "events in PO reached larger values than those in DAO even though the average values did not differ". Please clarify that statement. Which parameter + p-value in the table indicates this difference?

      Apologies for the omission. Currently the observation is only visible in the longer tail to the right in the PO data in Figure 2B2. We will add the ranges of values (3.0-75.2 vs 3.1-39.6 for PO and DAO amplitudes, respectively) to the text and the tables in the revision.

      Abbreviating airpuff to AP is confusing, I would suggest not abbreviating it.

      Understood. We will change AP to airpuff in the text. In figure labels, at least in some panels, the abbreviation will be necessary due to space constraints.

      What type of pulse was used to drive ChrimsonR? Could it be that the pulse caused a rebound-like phenomenon with the pulse duration that drove the excitation?

      As described on line 229 and in the Methods, we used 5-second trains of 5-ms LED light pulses. Importantly, these stimulation parameters were informed by our extensive in vitro examination of various stimulation patterns (Lefler et al., 2014), which consistently produced stable postsynaptic responses without inducing depolarization or rebound effects. Additionally, Loyola et al. (2024) reported no evidence of rebound activity in IO cells following optogenetic activation of N-O axons in the absence of direct neuronal depolarization. We will incorporate these considerations into the discussion, while also acknowledging that unequivocal confirmation of “direct” rebound excitation would require intracellular recordings, such as patch clamp experiments.

      The authors indicate that the excitatory activity was indistinguishable in shape from other calcium activity, but can anything be said about the timing (the scale bar in Figure 4A2 has no value, is it the same 2s pulse)?

      Apologies for the oversight in labeling the scale bar in Figure 4A2 (it is 2 s). While we deliberately refrain from making strong claims regarding the origin of the N-O-evoked spikes, their timing can be examined in more detail in Figure 4 - Supplement 1, panels C and D. We will make sure this is clearly stated in the revised text.

      Did the authors check for accidental sparse transfection with ChrimsonR of olivary neurons in the post-mortem analysis?

      Good point! However, we have never seen this AAV9-based viral construct to drive trans-synaptic expression in the IO, nor is this version of AAV known to have the capacity for transsynaptic expression in general.

      No sign of retrograde labeling (via the CF collaterals in the cerebellar nuclei) was seen either. Notably, the hSyn promoter used to drive ChrimsonR expression is extremely ineffective in the IO. Thus, we doubt that such accidental labeling could underlie the excitatory events seen upon N-O stimulation. We will add these mentions with relevant references to the discussion of the revised manuscript.

      On page 18 the authors state that: "The lower SS rate was attributed to intrinsic factors of PNs, while the reduced frequency of CSs was speculated to result from increased inhibition of the IO via the nucleo-olivary (N-O) pathway targeting the same microzone." I think I understand what you mean to say, but this is a bit confusing.

      Agreed. We will rephrase this sentence to clarify that a lower SS rate in a given microzone may lead to increased activation of inhibitory N-O axons that target the region of IO that sends CF to the same microzone.

      Is airpuff stimulation not more likely to activate PO than DAO because of the related modalities (more face vs. more trunk/limbs?), and thereby also more likely to drive event co-activation (as stated in the abstract)?

      We agree that the specific innervation patterns of different IO regions likely explain the discrepancy between previous reports of airpuff-evoked complex spikes in cerebellar cortical regions targeted by DAO and the absence of airpuff responses in the particular region of DAO accessible via our surgical approach. As in the present dataset virtually no airpuff-evoked events were seen in DAO regions, we are unable to directly compare airpuff-evoked event co-activation between PO and DAO. The higher co-activation for PO was observed for "spontaneous" activity.

      The Discussion addresses the question of why N-O pathway activation does not remove the airpuff response.

      Given the potentially profound effect, I would propose to expand the discussion on the role of anesthesia, including longer refractory periods but also the potential disruption of normal network interactions (even though individually the stimulations work). Briefly indicating what is known about alpha-chloralose would help interpret the results as well.

      We fully agree that the anesthetic state introduces confounding factors that must be considered when interpreting our results. We will expand the discussion to address how anesthesia, particularly alpha-chloralose, as well as tissue cooling, may contribute to prolonged refractory periods and potential disruptions of normal network interactions. However, we recognize that certain aspects cannot be fully resolved without recordings in awake animals. For this reason, we characterize our preparation as an "upgraded" in vitro approach rather than a fully representative in vivo model.

      Please clearly indicate that the age range of P35-45 is for the moment of virus injection and specify the age range for the imaging experiment.

      Apologies for the oversight. We will indicate these age ranges in the results (as they are currently only specified in Methods). The P35-45 range refers to moment of virus injection.

      The methods indicate that a low-pass filter of 1 Hz was used. I am sure this helps with smoothing, but does it not remove a lot of potentially interesting information? How would a higher low-pass cutoff affect the analysis and results?

      We acknowledge that applying a 1 Hz low-pass filter inevitably removes high-frequency components, including potential IO oscillations and fine details such as spike "doublets." However, given the temporal resolution constraints of our recording approach, we prioritized capturing robust, interpretable events over attempting to extract finer features that might be obscured by both the indicator kinetics and imaging speed.

      While a higher cut-off frequency could, in principle, allow more precise measurement of rise times and peak timings, it would also amplify high-frequency noise, complicating automated event detection and reducing confidence in distinguishing genuine neural signals from artifacts. Given these trade-offs, we opted for a conservative filtering approach to ensure stable event detection. Future work, particularly with faster imaging rates and improved sensors (GCaMP8s) will be used to explore the finer temporal structure of IO activity. We will deliberate on these matters more extensively in the revised discussion.
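      The trade-off described above (a conservative 1 Hz cutoff vs a higher one) can be illustrated with a standard zero-phase Butterworth low-pass filter; a sketch on a synthetic GCaMP6s-like transient (the decay constant, noise frequency, and amplitudes are illustrative assumptions, not the authors' pipeline):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 20.0                         # imaging rate reported in the paper
t = np.arange(0, 10, 1 / fs)

# Synthetic slow calcium transient (GCaMP6s-like decay) plus fast noise
transient = np.where(t >= 2, np.exp(-(t - 2) / 1.5), 0.0)
raw = transient + 0.3 * np.sin(2 * np.pi * 8 * t)

def lowpass(x, cutoff_hz, fs=fs, order=4):
    """Zero-phase Butterworth low-pass filter."""
    b, a = butter(order, cutoff_hz, btype="low", fs=fs)
    return filtfilt(b, a, x)

smooth_1hz = lowpass(raw, 1.0)    # conservative cutoff, as in the study
smooth_5hz = lowpass(raw, 5.0)    # higher cutoff: more detail, more noise
```

The 1 Hz trace suppresses the fast noise most strongly but also rounds off the transient onset, while the 5 Hz trace preserves the rise time at the cost of residual noise, which is exactly the detection-vs-timing trade-off the authors describe.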

      Reviewer #2 (Public review):

      The authors developed a strategy to image inferior olive somata via viral GCaMP6s expression, an implanted GRIN lens, and a one-photon head-mounted microscope, providing the first in vivo somatic recordings from these neurons. The main new findings relate to the activation of the nucleoolivary pathway, specifically that: this manipulation does not produce a spiking rebound in the IO; it exerts a larger effect on spontaneous IO spiking than stimulus (airpuff)-evoked spiking. In addition, several findings previously demonstrated in vivo in Purkinje cell complex spikes or inferior olivary axons are confirmed here in olivary somata: differences in event sizes from single cells versus co-activated cells; reduced coactivation when activating the NO pathway; more coactivation within a single zebrin compartment.

      The study presents some interesting findings, and for the most part, the analyses are appropriate. My two principal critiques are that the study does not acknowledge major technical limitations and their impact on the claims; and the study does not accurately represent prior work with respect to the current findings.

      We thank the reviewer for recognising the value of the findings in our "reduced" in vivo preparation, and apologize for omissions in the work that led to critique. We will elaborate on these matters below and prepare a revised manuscript.

      The authors use GCaMP6s, which has a tau1/2 of >1 s for a normal spike, and probably closer to 2 s (10.1038/nature12354) for the unique and long type of olivary spikes that give rise to axonal bursts (10.1016/j.neuron.2009.03.023). Indeed, the authors demonstrate as much (Fig. 2B1). This affects at least several claims:

      a. The authors report spontaneous spike rates of 0.1 Hz. They attribute this to anesthesia, yet other studies under anesthesia recording Purkinje complex spikes via either imaging or electrophysiology report spike rates as high as 1.5 Hz (10.1523/JNEUROSCI.2525-10.2011). This discrepancy is not acknowledged and a plausible explanation is not given. Citations are not provided that demonstrate such low anesthetized spike rates, nor are citations provided for the claim that spike rates drop increasingly with increasing levels of anesthesia when compared to awake resting conditions.

      We fully acknowledge that anesthesia is a major confounding factor in our study. Given the unusually invasive nature of our surgical preparation, we prioritized deep anesthesia to ensure the animals’ welfare. This, along with potential cooling effects from tissue removal and GRIN lens contact, likely contributed to the observed suppression of IO activity.

      We recognize that reported complex spike rates under anesthesia vary considerably across studies, and we will expand our discussion to provide a more comprehensive comparison with prior literature. Notably, different anesthetic protocols, levels of anesthesia, and recording methodologies can lead to widely different estimates of firing rates. While we cannot resolve this issue without recordings in awake animals, we will clarify that our observed rates likely reflect both the effects of anesthesia and specific methodological constraints. We will also incorporate additional references to studies examining cerebellar activity under different anesthetic conditions.

      More likely, this discrepancy reflects spikes that are missed due to a combination of the indicator kinetics and low imaging sensitivity (see (2)), neither of which are presented as possible plausible alternative explanations.

      We acknowledge that the combination of slow indicator kinetics and limited optical power in our miniature microscope setup constrains the temporal resolution of our recordings. However, we are confident that we can reliably detect events occurring at intervals of 1 second or longer. This confidence is based on data from another preparation using the same viral vector and optical system, where we observed spike rates an order of magnitude higher.

      That said, we do not make claims regarding the presence or absence of somatic events occurring at very short intervals (e.g., 100-ms "doublets," as described by Titley et al., 2019), as these would likely fall below our temporal resolution. We will clarify this limitation in the revised manuscript to ensure that the constraints of our approach are fully acknowledged.

      While GCaMP6s is not as sensitive as more recent variants (Zhang et al., 2023, PMID 36922596), our previous work (Dorgans et al., 2022) demonstrated that its dynamic range and sensitivity are sufficient to detect both spikes and subthreshold activity in vitro. Although the experimental conditions differ in the current miniscope experiments, we took measures to optimize signal quality, including excluding recordings with a low signal-to-noise ratio (see Methods). This need for high signal fidelity also informed our decision to limit the sampling rate to 20 fps. In future work, we plan to adopt newer GCaMP variants that were not available at the start of this project, which should further improve sensitivity and temporal resolution.

      Many claims are made throughout about co-activation ("clustering"), but with the GCaMP6s rise time to peak (0.5 s), there is little technical possibility to resolve co-activation. This limitation is not acknowledged as a caveat and the implications for the claims are not engaged with in the text.

      As noted in the manuscript (L492-), "interpreting fluorescence signals relative to underlying voltage changes is challenging, particularly in IO neurons with unusual calcium dynamics." We acknowledge that the slow rise time of GCaMP6s (~0.5 s) limits our ability to precisely resolve the timing of co-activation at very short intervals. However, given the relatively slow timescales of IO event clustering and the inherent synchrony in olivary network dynamics, we believe that the observed co-activation patterns remain meaningful, even if finer temporal details cannot be fully resolved.

      To ensure clarity, we will expand this section to explicitly acknowledge the temporal resolution limitations of our approach and discuss their implications for interpreting co-activation. While the precise timing of individual spikes within a cluster may not be resolvable, the observed increase in event magnitude with coarse co-activation suggests that clustering effects remain functionally relevant even when exact spike synchrony is not detectable at millisecond resolution.

      This finding is consistent with the idea that co-activation enhances calcium influx, leading to larger amplitude events — a relationship that does not require perfect temporal resolution to be observed. The fact that this effect persists across a broad range of clustering windows (as shown in Figure 2 Supplement 2) further supports its robustness. While we cannot make strong claims about precise spike timing within these clusters nor about the mechanism underlying enhanced calcium signal, our results demonstrate that co-activation may influence IO activity in a quantifiable way. We will clarify these points in the revised manuscript to ensure that our findings are appropriately framed given the temporal constraints of our imaging approach.

      The study reports an ultralong "refractory period" (L422-etc) in the IO, but this again must be tempered by the possibility that spikes are simply being missed due to very slow indicator kinetics and limited sensitivity. Indeed, the headline numeric estimate of 1.5 s (L445) is suspiciously close to the underlying indicator kinetic limitation of 1-2 s.

      Our findings suggest a potential refractory period limiting the frequency of events in the inferior olive under our recording conditions. This interpretation is supported by the observed inter-event interval distribution, the inability of N-O stimulation to suppress airpuff-evoked events, and the lower bounds reported in earlier literature on complex spike intervals recorded in awake animals under various behavioral contexts. Taking into account the likely cooling of the tissue, a refractory period of 1.5 s is not unreasonable. Of course, we recognize that the slow decay kinetics of GCaMP6s may cause overlapping fluorescence signals, potentially obscuring closely spaced events. This is in line with data presented in the Chen et al. 2013 manuscript describing GCaMP6s (Figure 3b, showing events detected at intervals of less than 500 ms).

      The consideration of refractoriness only arose late in the project while we were investigating the explanations for lack of inhibition of airpuff-evoked spikes. Future experiments, particularly in awake animals, will be instrumental in validating this interpretation. To ensure that the refractory period is understood as one possible mechanism rather than a definitive explanation, we will rephrase the discussion to clarify that while our data are compatible with a refractory period, they do not establish it conclusively.

      The study uses endoscopic one-photon miniaturized microscope imaging. Realistically, this is expected to permit an axial point spread function (z-PSF) on the order of 40 µm, which must substantially reduce resolution and sensitivity. This means that if there *is* local coactivation, the data in this study will very likely have individual ROIs that integrate signals from multiple neighboring cells. The study reports relationships between event magnitude and clustering, etc.; but a fluorescence signal that contains photons contributed by multiple neighboring neurons will be larger than that of a single neuron, regardless of the underlying physiology - the text does not acknowledge this possibility or limitation.

      We acknowledge that the use of one-photon endoscopic imaging imposes limitations on axial resolution, potentially leading to signal contributions from neighboring neurons. To mitigate this, we applied CNMFe processing, which allows for the deconvolution of overlapping signals and the differentiation of multiple neuronal sources within shared pixels. However, as the reviewer points out, if two neurons are perfectly overlapping in space, they may be treated as a single unit.

      To clarify this limitation, we will expand the discussion to explicitly acknowledge the impact of one-photon imaging on signal separation and to emphasize that, while CNMFe helps resolve some overlaps, perfect separation is not always possible. As already noted in the manuscript (L495-), "the absence of optical sectioning in the whole-field imaging method can lead to confounding artifacts in densely labeled structures such as the IO’s tortuous neuropil." We will further elaborate on how this factor was considered in our analysis and interpretation.

      Second, the text makes several claims for the first multicellular in vivo olivary recordings. (L11; L324, etc).

      I am aware of at least two studies that have recorded populations of single olivary axons using two-photon Ca2+ imaging up to 6 years ago (10.1016/j.neuron.2019.03.010; 10.7554/eLife.61593). This technique is not acknowledged or discussed, and one of these studies is not cited. No argument is presented for why axonal imaging should not "count" as multicellular in vivo olivary recording: axonal Ca2+ reflects somatic spiking.

      We appreciate the reviewer’s point and acknowledge the important prior work using two-photon imaging to record olivary axonal activity in the cerebellar cortex. However, while axonal calcium signals do reflect somatic spiking, these recordings inherently lack information about the local network interactions within the inferior olive itself.

      A key motivation for our study was to observe neuronal activity within the IO at the level of its gap-junction-coupled local circuits, rather than at the level of its divergent axonal outputs. The fan-like spread of climbing fibers across rostrocaudal microzones in the cerebellar cortex makes them relatively easy to record in vivo, but it also means that individual imaging fields contain axons from neurons that may be distributed across different IO microdomains. As a result, while previous work has provided valuable insight into olivary output patterns, it has not allowed for the examination of coordinated somatic activity within localized IO neuron clusters.

      With apologies, we recognize that this distinction was not sufficiently emphasized in our introduction. We will clarify this key point and ensure that the important climbing fiber imaging studies are properly cited and contextualized in the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      The authors state: "we found no reports that examined coactivation levels between Z+ and Z- microzones in cerebellar complex spike recordings" (L359). Multiple papers (that are not cited) using AldolaseC-tdTomato mice with two-photon Purkinje dendritic calcium imaging showed synchronization (at similar levels) within but not across z+/z- bands (10.1523/JNEUROSCI.2170-14.2015; https://doi.org/10.7554/eLife.86340).

      We apologize for the misleading phrasing. We will rephrase this statement to: "While complex spike coactivation within individual zebrin zones has been extensively studied (references), we found no reports directly comparing the levels of intra-zone co-activation between Z+ and Z- microzones."

      Additionally, we will ensure that the relevant studies demonstrating synchronization within zebrin zones, as well as (lack of) interactions between neighboring zones, are properly cited and discussed in the revised manuscript.

      The figures could use more proofreading, and several decisions should be reconsidered:

      Normalizing the amplitude to maximum is not a good strategy, as it can overemphasize noise or extremely small-magnitude signals, and should instead follow standard convention and present in fixed units (3A2, 4B2, and even 2C).

      As noted earlier, we have excluded recordings and cells with high noise or a low signal-to-noise ratio for event amplitudes, ensuring that such data do not influence the color-coded panels. Importantly, all quantitative analyses and traces presented in the manuscript are normalized to baseline noise level, not to maximal amplitude, ensuring that noise or low-magnitude signals do not skew the analysis.

      The decision to use max-amplitude normalization in color-coded panels was made specifically to aid visualization of temporal structure across recordings. This approach allows for clearer comparisons without the distraction of inter-cell variability in absolute signal strength. However, we recognize the potential for confusion and will revise the Results text to explicitly clarify that the color-coded visualizations use a different scaling method than the quantitative analyses.

      x axes with no units: Figures 2B2, 2E1, 3B2, 3C2, 5B2, 5C2, 5D2.

      No colorbar units: 5A3 (and should be shown in real not normalized units).

      No y axis units: 5D1.

      No x axis label or units: 5E1.

      5E3 says "stim/baseline" for the y-axis units and then the first-panel title says "absolute frequencies" meaning it’s *not* normalized and needs a separate (accurate) y-axis with units.

      Illegibly tiny fonts: 2E1, 3E1, etc.

      We will correct all of these in the revised manuscript. Thank you for the careful reading.

    1. Kernel Data Structures

      Kernel data structures maintain essential state information about I/O activities, ensuring efficient system operation. UNIX’s open-file table, for example, tracks file descriptors, file system records, and active inodes, streamlining file management. Similar structures exist for network connections and character devices. Object-oriented approaches further enhance modularity, as seen in UNIX’s dispatch tables and Windows’ message-passing system. In Windows, I/O requests are encapsulated as messages, enabling flexible interactions between the kernel, I/O manager, and device drivers. Although this approach introduces additional processing overhead, it simplifies I/O management and enhances system flexibility. These structured methods ensure that operating systems can efficiently track, manage, and process diverse I/O operations, leading to more stable and efficient computing environments.
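      The open-file-table idea above can be illustrated with a toy sketch. This is a simplified model with hypothetical names (`Process`, `system_open_file_table`), not the actual UNIX kernel structures: each per-process descriptor indexes into a shared, system-wide record, so two opens of the same file share one record.

```python
# Toy sketch of a per-process open-file table: file descriptors index
# into entries that point at system-wide file records (here, plain dicts).
# Names and layout are illustrative, not real kernel structures.
system_open_file_table = {}    # inode-like records shared by all processes

class Process:
    def __init__(self):
        self.fd_table = {}     # per-process descriptor -> system-wide record
        self.next_fd = 3       # 0-2 reserved for stdin/stdout/stderr

    def open(self, name):
        # All processes opening the same file share one record;
        # a reference count tracks how many descriptors point at it.
        record = system_open_file_table.setdefault(name, {"name": name, "refs": 0})
        record["refs"] += 1
        fd = self.next_fd
        self.next_fd += 1
        self.fd_table[fd] = record
        return fd

p = Process()
fd = p.open("/etc/hosts")
print(fd, p.fd_table[fd]["name"])   # 3 /etc/hosts
```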

    2. Spooling and Device Reservation A spool is a buffer that holds output for a device, such as a printer, that cannot accept interleaved data streams. Although a printer can serve only one job at a time, several applications may wish to print their output concurrently, without having their output mixed together. The operating system solves this problem by intercepting all output to the printer. Each application's output is spooled to a separate secondary storage file. When an application finishes printing, the spooling system queues the corresponding spool file for output to the printer. The spooling system copies the queued spool files to the printer one at a time. In some operating systems, spooling is managed by a system daemon process. In others, it is handled by an in-kernel thread. In either case, the operating system provides a control interface that enables users and system administrators to display the queue, remove unwanted jobs before those jobs print, suspend printing while the printer is serviced, and so on. Some devices, such as tape drives and printers, cannot usefully multiplex the I/O requests of multiple concurrent applications. Spooling is one way operating systems can coordinate concurrent output. Another way to deal with concurrent device access is to provide explicit facilities for coordination. Some operating systems (including VMS) provide support for exclusive device access by enabling a process to allocate an idle device and to deallocate that device when it is no longer needed. Other operating systems enforce a limit of one open file handle to such a device. Many operating systems provide functions that enable processes to coordinate exclusive access among themselves. For instance, Windows provides system calls to wait until a device object becomes available. It also has a parameter to the OpenFile() system call that declares the types of access to be permitted to other concurrent threads. 
On these systems, it is up to the applications to avoid deadlock.

      Spooling is a technique used to manage output devices that cannot handle interleaved data streams, such as printers. Instead of sending data directly to the device, the system stores it in a spool—a designated buffer in secondary storage—ensuring orderly processing. A spooler daemon manages the queue, allowing users to monitor and manipulate print jobs. Additionally, some devices require exclusive access to prevent conflicts. Device reservation mechanisms prevent multiple processes from simultaneously accessing non-shareable devices like tape drives. Operating systems enforce these constraints through explicit allocation requests or file-handle restrictions, ensuring coordinated access. In environments like Windows, system calls allow processes to wait for device availability, preventing deadlocks. Effective spooling and reservation strategies optimize device utilization and maintain orderly, conflict-free access to critical system resources.
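      The spooling discipline above can be sketched in a few lines. This is a toy model (the `Spooler` class and `device` callable are illustrative, not any real OS interface): each job is buffered in full before it is queued, so output from different applications is never interleaved at the device.

```python
import queue

class Spooler:
    """Toy print-spooler sketch: each job's output is buffered in full,
    then queued; the device receives one complete job at a time."""
    def __init__(self, device):
        self.device = device          # callable standing in for the printer
        self.jobs = queue.Queue()

    def submit(self, name, data):
        # Intercept output: buffer the whole job before queuing it,
        # so jobs from different applications cannot be interleaved.
        self.jobs.put((name, data))

    def run(self):
        # Copy the queued spool files to the device one at a time.
        while not self.jobs.empty():
            name, data = self.jobs.get()
            self.device(data)

printed = []
spool = Spooler(printed.append)
spool.submit("app1", "A1 A2")
spool.submit("app2", "B1")
spool.run()
print(printed)   # each job arrives whole and in submission order
```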

    3. Because the performance and addressing characteristics of network I/O differ significantly from those of disk I/O, most operating systems provide a network I/O interface that is different from the read()–write()–seek() interface used for disks. One interface available in many operating systems, including UNIX and Windows, is the network socket interface. Think of a wall socket for electricity: any electrical appliance can be plugged in. By analogy, the system calls in the socket interface enable an application to create a socket, to connect a local socket to a remote address (which plugs this application into a socket created by another application), to listen for any remote application to plug into the local socket, and to send and receive packets over the connection. To support the implementation of network servers, the socket interface also provides a function called select() that manages a set of sockets. A call to select() returns information about which sockets have a packet waiting to be received and which sockets have room to accept a packet to be sent. The use of select() eliminates the polling and busy waiting that would otherwise be necessary for network I/O. These functions encapsulate the essential behaviors of networks, greatly facilitating the creation of distributed applications that can use any underlying network hardware and protocol stack. Many other approaches to interprocess communication and network communication have been implemented. For instance, Windows provides one interface to the network interface card and a second interface to the network protocols. In UNIX, which has a long history as a proving ground for network technology, we find half-duplex pipes, full-duplex FIFOs, full-duplex STREAMS, message queues, and sockets. Information on UNIX networking is given in Section C.9.

      Due to distinct performance and addressing characteristics, network I/O employs a different interface than disk I/O. Many operating systems, including UNIX and Windows, implement the socket interface for network communication. The socket system calls enable applications to create, connect, listen, send, and receive data over network connections. The select() function is highlighted for managing multiple sockets efficiently, allowing applications to detect readable or writable sockets without continuous polling. Network communication in UNIX is further explored, showcasing diverse interprocess communication methods such as pipes, FIFOs, message queues, and STREAMS. The section underscores the importance of encapsulating network behaviors through standardized interfaces, simplifying the development of distributed applications that function across different network hardware and protocol stacks.
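      The select() mechanism described above can be demonstrated directly with Python's standard socket and select modules; a connected socket pair stands in for a client/server connection.

```python
import select
import socket

# A connected pair of sockets stands in for a network connection.
a, b = socket.socketpair()
a.sendall(b"ping")

# select() reports which sockets have data waiting (readable) and which
# have buffer room (writable), avoiding polling and busy waiting.
readable, writable, _ = select.select([b], [b], [], 1.0)
assert b in readable          # a packet is waiting to be received
data = b.recv(4)
print(data)                   # b'ping'
a.close()
b.close()
```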

    4. 8.4 Methods for Handling Deadlocks

      There are three general approaches to handling deadlocks:

      1. Ignoring Deadlocks: Many operating systems, including Linux and Windows, do not implement specific deadlock-handling mechanisms. Instead, they leave the responsibility to kernel and application developers to prevent and resolve deadlocks.

      2. Deadlock Prevention or Avoidance: This approach ensures that a system never enters a deadlocked state. Deadlock prevention works by ensuring that at least one of the necessary conditions for deadlocks is eliminated. Deadlock avoidance requires prior knowledge of resource requests to make decisions that prevent circular waits.

      3. Deadlock Detection and Recovery: Some systems, such as databases, allow deadlocks to occur and then implement algorithms to detect and recover from them. If no detection and recovery mechanisms exist, the system may deteriorate until manual intervention is required.

      Each approach has its own advantages and drawbacks. Ignoring deadlocks is cost-effective but risks system failure. Prevention and avoidance require additional constraints on resource requests, which may reduce system efficiency. Detection and recovery offer flexibility but may introduce significant computational overhead.
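      The detection-and-recovery approach can be sketched as a cycle search over a wait-for graph (the standard reduction for single-instance resources): a cycle among blocked threads means deadlock. The function name and graph encoding here are illustrative.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph via depth-first search.
    wait_for[t] is the set of threads that thread t is blocked on;
    a cycle in this graph corresponds to a circular wait (deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / in progress / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GREY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GREY:       # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in list(color))

# T1 waits on T2 and T2 waits on T1: circular wait, hence deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False
```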

    5. 4.7.1 Windows Threads

      Windows threads follow a one-to-one mapping, where each user-level thread corresponds to a kernel thread. Every thread has essential components, including a thread ID, program counter, register set, and user/kernel stacks. Windows threads are represented by key data structures: ETHREAD (executive thread block), KTHREAD (kernel thread block), and TEB (thread environment block). The ETHREAD links the thread to its parent process, while KTHREAD manages scheduling and kernel stack usage. The TEB resides in user space, containing thread-local storage and other user-mode data. Since ETHREAD and KTHREAD exist in kernel space, only the operating system can access them. This structured approach ensures efficient thread management, optimizing performance in multithreaded Windows applications.

    6. 4.6 Threading Issues In this section, we discuss some of the issues to consider in designing multithreaded programs. 4.6.1 The fork() and exec() System Calls In Chapter 3, we described how the fork() system call is used to create a separate, duplicate process. The semantics of the fork() and exec() system calls change in a multithreaded program. If one thread in a program calls fork(), does the new process duplicate all threads, or is the new process single-threaded? Some UNIX systems have chosen to have two versions of fork(), one that duplicates all threads and another that duplicates only the thread that invoked the fork() system call. The exec() system call typically works in the same way as described in Chapter 3. That is, if a thread invokes the exec() system call, the program specified in the parameter to exec() will replace the entire process—including all threads. Which of the two versions of fork() to use depends on the application. If exec() is called immediately after forking, then duplicating all threads is unnecessary, as the program specified in the parameters to exec() will replace the process. In this instance, duplicating only the calling thread is appropriate. If, however, the separate process does not call exec() after forking, the separate process should duplicate all threads. 4.6.2 Signal Handling A signal is used in UNIX systems to notify a process that a particular event has occurred. A signal may be received either synchronously or asynchronously, depending on the source of and the reason for the event being signaled. All signals, whether synchronous or asynchronous, follow the same pattern: 1. A signal is generated by the occurrence of a particular event. 2. The signal is delivered to a process. 3. Once delivered, the signal must be handled. Examples of synchronous signals include illegal memory access and division by 0. If a running program performs either of these actions, a signal is generated. 
Synchronous signals are delivered to the same process that performed the operation that caused the signal (that is the reason they are considered synchronous). When a signal is generated by an event external to a running process, that process receives the signal asynchronously. Examples of such signals include terminating a process with specific keystrokes (such as <control><C>) and having a timer expire. Typically, an asynchronous signal is sent to another process. A signal may be handled by one of two possible handlers: 1. A default signal handler 2. A user-defined signal handler Every signal has a default signal handler that the kernel runs when handling that signal. This default action can be overridden by a user-defined signal handler that is called to handle the signal. Signals are handled in different ways. Some signals may be ignored, while others (for example, an illegal memory access) are handled by terminating the program. Handling signals in single-threaded programs is straightforward: signals are always delivered to a process. However, delivering signals is more complicated in multithreaded programs, where a process may have several threads. Where, then, should a signal be delivered? In general, the following options exist: 1. Deliver the signal to the thread to which the signal applies. 2. Deliver the signal to every thread in the process. 3. Deliver the signal to certain threads in the process. 4. Assign a specific thread to receive all signals for the process. The method for delivering a signal depends on the type of signal generated. For example, synchronous signals need to be delivered to the thread causing the signal and not to other threads in the process. However, the situation with asynchronous signals is not as clear. Some asynchronous signals—such as a signal that terminates a process (<control><C>, for example)—should be sent to all threads. 
The standard UNIX function for delivering a signal is kill(pid_t pid, int signal) This function specifies the process (pid) to which a particular signal (signal) is to be delivered. Most multithreaded versions of UNIX allow a thread to specify which signals it will accept and which it will block. Therefore, in some cases, an asynchronous signal may be delivered only to those threads that are not blocking it. However, because signals need to be handled only once, a signal is typically delivered only to the first thread found that is not blocking it. POSIX Pthreads provides the following function, which allows a signal to be delivered to a specified thread (tid): pthread_kill(pthread_t tid, int signal) Although Windows does not explicitly provide support for signals, it allows us to emulate them using asynchronous procedure calls (APCs). The APC facility enables a user thread to specify a function that is to be called when the user thread receives notification of a particular event. As indicated by its name, an APC is roughly equivalent to an asynchronous signal in UNIX. However, whereas UNIX must contend with how to deal with signals in a multithreaded environment, the APC facility is more straightforward, since an APC is delivered to a particular thread rather than a process. 4.6.3 Thread Cancellation Thread cancellation involves terminating a thread before it has completed. For example, if multiple threads are concurrently searching through a database and one thread returns the result, the remaining threads might be canceled. Another situation might occur when a user presses a button on a web browser that stops a web page from loading any further. Often, a web page loads using several threads—each image is loaded in a separate thread. When a user presses the stop button on the browser, all threads loading the page are canceled. A thread that is to be canceled is often referred to as the target thread. 
Cancellation of a target thread may occur in two different scenarios: 1. Asynchronous cancellation. One thread immediately terminates the target thread. 2. Deferred cancellation. The target thread periodically checks whether it should terminate, allowing it an opportunity to terminate itself in an orderly fashion. The difficulty with cancellation occurs in situations where resources have been allocated to a canceled thread or where a thread is canceled while in the midst of updating data it is sharing with other threads. This becomes especially troublesome with asynchronous cancellation. Often, the operating system will reclaim system resources from a canceled thread but will not reclaim all resources. Therefore, canceling a thread asynchronously may not free a necessary system-wide resource. With deferred cancellation, in contrast, one thread indicates that a target thread is to be canceled, but cancellation occurs only after the target thread has checked a flag to determine whether or not it should be canceled. The thread can perform this check at a point at which it can be canceled safely. In Pthreads, thread cancellation is initiated using the pthread_cancel() function.

      Multithreaded programs face several challenges, including handling system calls like fork() and exec(). The behavior of fork() varies—some UNIX implementations duplicate all threads, while others duplicate only the calling thread. If exec() is called immediately after forking, duplicating all threads is unnecessary. Signal handling is another challenge, as signals can be synchronous (e.g., division by zero) or asynchronous (e.g., termination signals). Signals in multithreaded programs can be delivered to a specific thread, all threads, or certain threads based on signal type. POSIX Pthreads provide pthread_kill() to direct signals to a specific thread. Windows uses Asynchronous Procedure Calls (APCs) to handle event-driven notifications, similar to UNIX signals. Developers must ensure proper signal handling to avoid unintended behavior. Thread cancellation is another concern, requiring careful implementation to prevent resource leaks and ensure that terminated threads do not leave operations incomplete.
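      Deferred cancellation, as described above, amounts to a flag that the target thread checks at safe points. Here is a minimal sketch using Python's threading module (an analogue of the Pthreads pattern, not Pthreads itself; the flag name is illustrative):

```python
import threading
import time

cancel_requested = threading.Event()   # the cancellation flag
progress = []

def worker():
    # Deferred cancellation: the target thread checks the flag at points
    # where it can terminate safely, instead of being killed mid-update.
    for step in range(1000):
        if cancel_requested.is_set():
            return                     # orderly self-termination
        progress.append(step)
        time.sleep(0.001)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.01)
cancel_requested.set()                 # another thread requests cancellation
t.join()
print(len(progress) < 1000)            # True: the worker stopped early
```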

    7. 4.5.5 Intel Thread Building Blocks Intel threading building blocks (TBB) is a template library that supports designing parallel applications in C++. As this is a library, it requires no special compiler or language support. Developers specify tasks that can run in parallel, and the TBB task scheduler maps these tasks onto underlying threads. Furthermore, the task scheduler provides load balancing and is cache aware, meaning that it will give precedence to tasks that likely have their data stored in cache memory and thus will execute more quickly. TBB provides a rich set of features, including templates for parallel loop structures, atomic operations, and mutual exclusion locking. In addition, it provides concurrent data structures, including a hash map, queue, and vector, which can serve as equivalent thread-safe versions of the C++ standard template library data structures. Let's use parallel for loops as an example. Initially, assume there is a function named apply(float value) that performs an operation on the parameter value. If we had an array v of size n containing float values, we could use the following serial for loop to pass each value in v to the apply() function: for (int i = 0; i < n; i++) {   apply(v[i]); } A developer could manually apply data parallelism (Section 4.2.2) on a multicore system by assigning different regions of the array v to each processing core; however, this ties the technique for achieving parallelism closely to the physical hardware, and the algorithm would have to be modified and recompiled for the number of processing cores on each specific architecture. Alternatively, a developer could use TBB, which provides a parallel_for template that expects two values: parallel_for (range   body) where range refers to the range of elements that will be iterated (known as the iteration space) and body specifies an operation that will be performed on a subrange of elements. 
We can now rewrite the above serial for loop using the TBB parallel_for template as follows: parallel_for (size_t(0), n, [=](size_t i) {apply(v[i]);}); The first two parameters specify that the iteration space is from 0 to n − 1 (which corresponds to the number of elements in the array v). The second parameter is a C++ lambda function that requires a bit of explanation. The expression [=](size_t i) is the parameter i, which assumes each of the values over the iteration space (in this case from 0 to n − 1). Each value of i is used to identify which array element in v is to be passed as a parameter to the apply(v[i]) function. The TBB library will divide the loop iterations into separate “chunks” and create a number of tasks that operate on those chunks. (The parallel_for function allows developers to manually specify the size of the chunks if they wish to.) TBB will also create a number of threads and assign tasks to available threads. This is quite similar to the fork-join library in Java. The advantage of this approach is that it requires only that developers identify what operations can run in parallel (by specifying a parallel_for loop), and the library manages the details involved in dividing the work into separate tasks that run in parallel. Intel TBB has both commercial and open-source versions that run on Windows, Linux, and macOS. Refer to the bibliography for further details on how to develop parallel applications using TBB.

      Intel TBB is a powerful template library for parallel programming in C++. Unlike OpenMP, it does not require compiler support but instead provides a task-based approach for parallel execution. The TBB task scheduler dynamically maps tasks to threads, optimizing load balancing and cache efficiency. The parallel_for template automates parallelization of loops, ensuring efficient distribution of workload. This method abstracts hardware-specific optimizations, making it adaptable across different multicore systems. TBB also provides concurrent data structures such as thread-safe hash maps and vectors, enhancing performance in multithreaded applications. The flexibility of TBB allows developers to scale applications efficiently without modifying code for different hardware configurations. While TBB offers advanced features for parallelism, it requires a solid understanding of lambda functions and task dependencies to maximize performance. Proper use of TBB ensures improved computational efficiency in complex applications.
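      The chunking strategy behind TBB's parallel_for can be mimicked in Python for illustration. This is only an analogue, not TBB itself: the `parallel_for` helper below splits the iteration space into chunks and hands them to a thread pool, just as the TBB scheduler maps chunked tasks onto threads (the chunk size and function names are illustrative).

```python
from concurrent.futures import ThreadPoolExecutor

def apply(value):
    # Stand-in for the apply() operation discussed in the text.
    return value * value

def parallel_for(values, body, chunk=4):
    """Rough Python analogue of TBB's parallel_for: divide the iteration
    space into chunks and let a thread pool map the chunked tasks onto
    threads, then reassemble the results in order."""
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda c: [body(x) for x in c], chunks)
    return [r for part in results for r in part]

v = [0.5, 1.0, 2.0, 3.0, 4.0]
print(parallel_for(v, apply))   # [0.25, 1.0, 4.0, 9.0, 16.0]
```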

    8. THE JVM AND THE HOST OPERATING SYSTEM

      The Java Virtual Machine (JVM) operates on top of a host OS, abstracting platform-specific threading details. The JVM does not dictate how Java threads map to OS threads, leaving it to individual implementations. Windows, for example, employs a one-to-one threading model, meaning each Java thread corresponds to a kernel thread. This adaptability allows Java applications to run consistently across different OS environments, utilizing native threading libraries such as Windows API or Pthreads on Linux/macOS.

    9. Thread Libraries

      A thread library provides an API for creating and managing threads, allowing developers to handle multithreading efficiently. These libraries can be implemented at either the user level or the kernel level.

      A user-level thread library operates in user space, avoiding system calls for better performance. However, if a thread makes a blocking system call, all threads in the process may be blocked.

      A kernel-level thread library is supported directly by the operating system. While slightly slower due to system calls, it allows better thread management and parallel execution.

      Some of the most widely used thread libraries include POSIX Pthreads (common in UNIX/Linux systems), Windows Threads, and Java Threads, which rely on the underlying OS for execution.

    10. The one-to-one model (Figure 4.8) maps each user thread to a kernel thread. It provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call. It also allows multiple threads to run in parallel on multiprocessors. The only drawback to this model is that creating a user thread requires creating the corresponding kernel thread, and a large number of kernel threads may burden the performance of a system. Linux, along with the family of Windows operating systems, implement the one-to-one model.

      In the one-to-one model, each user thread is directly mapped to a kernel thread. This model improves concurrency because if one thread blocks, others can continue execution. Furthermore, it supports true parallelism, allowing threads to run on multiple processors. Despite these advantages, the model comes with high overhead. Creating a user thread also requires creating a corresponding kernel thread, which can lead to resource exhaustion if too many threads are created.
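      The concurrency benefit of one-to-one mapping can be observed directly: since each Python thread is backed by its own kernel thread, a thread stuck in a blocking call does not prevent another thread from running (the sleep below stands in for a blocking system call).

```python
import threading
import time

results = []

def blocker():
    time.sleep(0.2)                    # stands in for a blocking system call
    results.append("blocker done")

def worker():
    results.append("worker ran")

# With one-to-one mapping, the blocked thread does not stop the other:
t1 = threading.Thread(target=blocker)
t2 = threading.Thread(target=worker)
t1.start()
t2.start()
t1.join()
t2.join()
print(results)   # the worker finishes while the blocker is still blocked
```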

    11. 3.8.2 Remote Procedure Calls One of the most common forms of remote service is the RPC paradigm, which was designed as a way to abstract the procedure-call mechanism for use between systems with network connections. It is similar in many respects to the IPC mechanism described in Section 3.4, and it is usually built on top of such a system. Here, however, because we are dealing with an environment in which the processes are executing on separate systems, we must use a message-based communication scheme to provide remote service. In contrast to IPC messages, the messages exchanged in RPC communication are well structured and are thus no longer just packets of data. Each message is addressed to an RPC daemon listening to a port on the remote system, and each contains an identifier specifying the function to execute and the parameters to pass to that function. The function is then executed as requested, and any output is sent back to the requester in a separate message. A port in this context is simply a number included at the start of a message packet. Whereas a system normally has one network address, it can have many ports within that address to differentiate the many network services it supports. If a remote process needs a service, it addresses a message to the proper port. For instance, if a system wished to allow other systems to be able to list its current users, it would have a daemon supporting such an RPC attached to a port—say, port 3027. Any remote system could obtain the needed information (that is, the list of current users) by sending an RPC message to port 3027 on the server. The data would be received in a reply message. The semantics of RPCs allows a client to invoke a procedure on a remote host as it would invoke a procedure locally. The RPC system hides the details that allow communication to take place by providing a stub on the client side. Typically, a separate stub exists for each separate remote procedure. 
When the client invokes a remote procedure, the RPC system calls the appropriate stub, passing it the parameters provided to the remote procedure. This stub locates the port on the server and marshals the parameters. The stub then transmits a message to the server using message passing. A similar stub on the server side receives this message and invokes the procedure on the server. If necessary, return values are passed back to the client using the same technique. On Windows systems, stub code is compiled from a specification written in the Microsoft Interface Definition Language (MIDL), which is used for defining the interfaces between client and server programs. Parameter marshaling addresses the issue concerning differences in data representation on the client and server machines. Consider the representation of 32-bit integers. Some systems (known as big-endian) store the most significant byte first, while other systems (known as little-endian) store the least significant byte first. Neither order is “better” per se; rather, the choice is arbitrary within a computer architecture. To resolve differences like this, many RPC systems define a machine-independent representation of data. One such representation is known as external data representation (XDR). On the client side, parameter marshaling involves converting the machine-dependent data into XDR before they are sent to the server. On the server side, the XDR data are unmarshaled and converted to the machine-dependent representation for the server. Another important issue involves the semantics of a call. Whereas local procedure calls fail only under extreme circumstances, RPCs can fail, or be duplicated and executed more than once, as a result of common network errors. One way to address this problem is for the operating system to ensure that messages are acted on exactly once, rather than at most once. Most local procedure calls have the “exactly once” functionality, but it is more difficult to implement. 
First, consider “at most once.” This semantic can be implemented by attaching a timestamp to each message. The server must keep a history of all the timestamps of messages it has already processed or a history large enough to ensure that repeated messages are detected. Incoming messages that have a timestamp already in the history are ignored. The client can then send a message one or more times and be assured that it only executes once. For “exactly once,” we need to remove the risk that the server will never receive the request. To accomplish this, the server must implement the “at most once” protocol described above but must also acknowledge to the client that the RPC call was received and executed. These ACK messages are common throughout networking. The client must resend each RPC call periodically until it receives the ACK for that call. Yet another important issue concerns the communication between a server and a client. With standard procedure calls, some form of binding takes place during link, load, or execution time (Chapter 9) so that a procedure call's name is replaced by the memory address of the procedure call. The RPC scheme requires a similar binding of the client and the server port, but how does a client know the port numbers on the server? Neither system has full information about the other, because they do not share memory. Two approaches are common. First, the binding information may be predetermined, in the form of fixed port addresses. At compile time, an RPC call has a fixed port number associated with it. Once a program is compiled, the server cannot change the port number of the requested service. Second, binding can be done dynamically by a rendezvous mechanism. Typically, an operating system provides a rendezvous (also called a matchmaker) daemon on a fixed RPC port. A client then sends a message containing the name of the RPC to the rendezvous daemon requesting the port address of the RPC it needs to execute. 
The port number is returned, and the RPC calls can be sent to that port until the process terminates (or the server crashes). This method requires the extra overhead of the initial request but is more flexible than the first approach. Figure 3.29 shows a sample interaction.
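The rendezvous (“matchmaker”) daemon can be pictured as a registry on a well-known port that maps RPC names to dynamically assigned ports (a toy sketch with hypothetical names; a real matchmaker answers lookups over the network, not through a local dictionary):

```python
class Matchmaker:
    """Toy matchmaker: servers register RPC names, clients look them up."""

    def __init__(self):
        self.registry = {}       # RPC name -> port number
        self.next_port = 5000    # ports handed out dynamically

    def register(self, rpc_name):
        # A server registers a procedure and receives a port for it.
        port = self.next_port
        self.next_port += 1
        self.registry[rpc_name] = port
        return port

    def lookup(self, rpc_name):
        # A client asks the fixed matchmaker port: where is this RPC?
        return self.registry.get(rpc_name)

mm = Matchmaker()
mm.register("get_balance")
# The client pays the lookup cost once, then reuses the returned port
# for every subsequent call until the server terminates.
assert mm.lookup("get_balance") == 5000
```

The one extra round trip to the matchmaker is the overhead the text mentions; the payoff is that servers can move their services to new ports without recompiling clients.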

      Remote Procedure Calls (RPCs) abstract the communication process between distributed systems by allowing a client to invoke procedures on a remote machine as if they were local functions. Unlike raw socket communication, which requires applications to structure their own messages, RPCs handle function calls transparently. A key component of RPCs is parameter marshaling, which ensures data compatibility between different architectures (e.g., big-endian vs. little-endian). Additionally, RPC implementations must handle network failures and duplicate execution risks, requiring techniques such as timestamping and acknowledgment messages. The use of a matchmaker service for dynamic binding enhances flexibility, allowing clients to locate available RPC services at runtime rather than relying on fixed port assignments.

    12. 3.7.4.2 Named Pipes

      Ordinary pipes provide a simple mechanism for allowing a pair of processes to communicate. However, ordinary pipes exist only while the processes are communicating with one another. On both UNIX and Windows systems, once the processes have finished communicating and have terminated, the ordinary pipe ceases to exist.

      Named pipes provide a much more powerful communication tool. Communication can be bidirectional, and no parent–child relationship is required. Once a named pipe is established, several processes can use it for communication. In fact, in a typical scenario, a named pipe has several writers. Additionally, named pipes continue to exist after communicating processes have finished. Both UNIX and Windows systems support named pipes, although the details of implementation differ greatly. Next, we explore named pipes in each of these systems.

      Named pipes are referred to as FIFOs in UNIX systems. Once created, they appear as typical files in the file system. A FIFO is created with the mkfifo() system call and manipulated with the ordinary open(), read(), write(), and close() system calls. It will continue to exist until it is explicitly deleted from the file system. Although FIFOs allow bidirectional communication, only half-duplex transmission is permitted. If data must travel in both directions, two FIFOs are typically used. Additionally, the communicating processes must reside on the same machine. If intermachine communication is required, sockets (Section 3.8.1) must be used.

      Named pipes on Windows systems provide a richer communication mechanism than their UNIX counterparts. Full-duplex communication is allowed, and the communicating processes may reside on either the same or different machines. Additionally, only byte-oriented data may be transmitted across a UNIX FIFO, whereas Windows systems allow either byte- or message-oriented data.
Named pipes are created with the CreateNamedPipe() function, and a client can connect to a named pipe using ConnectNamedPipe(). Communication over the named pipe can be accomplished using the ReadFile() and WriteFile() functions.
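The UNIX FIFO calls described above can be exercised directly from Python (POSIX-only, since `os.mkfifo` is unavailable on Windows; a writer thread stands in for a second process in this minimal sketch):

```python
import os
import tempfile
import threading

# Create a FIFO in a temporary directory -- Python's binding of mkfifo().
tmpdir = tempfile.mkdtemp()
fifo_path = os.path.join(tmpdir, "demo_fifo")
os.mkfifo(fifo_path)

def writer():
    # Opening for writing blocks until a reader opens the other end.
    with open(fifo_path, "w") as f:
        f.write("hello through the FIFO")

t = threading.Thread(target=writer)
t.start()

# The FIFO is manipulated with the ordinary open()/read()/close() calls.
with open(fifo_path, "r") as f:
    message = f.read()
t.join()

assert message == "hello through the FIFO"

# Unlike an ordinary pipe, the FIFO persists until explicitly deleted.
os.unlink(fifo_path)
os.rmdir(tmpdir)
```

Note the half-duplex limitation from the text: for data flowing both ways, a second FIFO would be created alongside this one.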

      Windows systems implement ordinary pipes, called anonymous pipes, with the CreatePipe() function. The difference from UNIX is that Windows requires explicit handling of pipe inheritance for child processes. The parent creates the pipe, and when invoking CreateProcess(), the parent's pipe handles must be passed to the child. This is achieved by manipulating the STARTUPINFO structure to redirect the child’s standard input and output to the pipe’s read and write ends.

      Once the child process is created, the parent writes to the pipe using WriteFile(), and the child reads from the pipe using ReadFile(). After communication, both ends of the pipe are closed. The main advantage of Windows anonymous pipes is the ability to explicitly control the inheritance of pipe handles, ensuring that the child only has access to the required pipe ends. This approach facilitates secure and isolated communication between the parent and child processes.
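The same parent-writes/child-reads flow has a direct UNIX analogue, sketched here from Python (POSIX-only; `os.pipe()` plays the role of CreatePipe() and `fork()` that of CreateProcess(), so this models the pattern rather than the Windows API itself):

```python
import os

read_fd, write_fd = os.pipe()   # POSIX counterpart of CreatePipe()

pid = os.fork()                 # child creation, standing in for CreateProcess()
if pid == 0:
    # Child: close the unused write end, then read the parent's message.
    os.close(write_fd)
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    os._exit(0 if data == b"hello from the parent" else 1)

# Parent: close the unused read end, write, then wait for the child.
os.close(read_fd)
os.write(write_fd, b"hello from the parent")
os.close(write_fd)
_, status = os.waitpid(pid, 0)

# Exit status 0 confirms the child read exactly what the parent wrote.
assert os.WEXITSTATUS(status) == 0
```

Closing the unused ends in each process mirrors the handle-inheritance control the Windows version achieves through STARTUPINFO: each side keeps access only to the pipe end it actually needs.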

    13. 3.7.3 Windows

      The Windows operating system is an example of modern design that employs modularity to increase functionality and decrease the time needed to implement new features. Windows provides support for multiple operating environments, or subsystems. Application programs communicate with these subsystems via a message-passing mechanism. Thus, application programs can be considered clients of a subsystem server.

      The message-passing facility in Windows is called the advanced local procedure call (ALPC) facility. It is used for communication between two processes on the same machine. It is similar to the standard remote procedure call (RPC) mechanism that is widely used, but it is optimized for and specific to Windows. (Remote procedure calls are covered in detail in Section 3.8.2.) Like Mach, Windows uses a port object to establish and maintain a connection between two processes. Windows uses two types of ports: connection ports and communication ports. Server processes publish connection-port objects that are visible to all processes. When a client wants services from a subsystem, it opens a handle to the server's connection-port object and sends a connection request to that port. The server then creates a channel and returns a handle to the client. The channel consists of a pair of private communication ports: one for client–server messages, the other for server–client messages. Additionally, communication channels support a callback mechanism that allows the client and server to accept requests when they would normally be expecting a reply.

      When an ALPC channel is created, one of three message-passing techniques is chosen:

      1. For small messages (up to 256 bytes), the port's message queue is used as intermediate storage, and the messages are copied from one process to the other.
      2. Larger messages must be passed through a section object, which is a region of shared memory associated with the channel.
      3. When the amount of data is too large to fit into a section object, an API is available that allows server processes to read and write directly into the address space of a client.

      The client has to decide when it sets up the channel whether it will need to send a large message. If the client determines that it does want to send large messages, it asks for a section object to be created. Similarly, if the server decides that replies will be large, it creates a section object. So that the section object can be used, a small message is sent that contains a pointer and size information about the section object. This method is more complicated than the first method listed above, but it avoids data copying. The structure of advanced local procedure calls in Windows is shown in Figure 3.19.

      Figure 3.19 Advanced local procedure calls in Windows.

      It is important to note that the ALPC facility in Windows is not part of the Windows API and hence is not visible to the application programmer. Rather, applications using the Windows API invoke standard remote procedure calls. When the RPC is being invoked on a process on the same system, the RPC is handled indirectly through an ALPC procedure call. Additionally, many kernel services use ALPC to communicate with client processes.

      3.7.4 Pipes

      A pipe acts as a conduit allowing two processes to communicate. Pipes were one of the first IPC mechanisms in early UNIX systems. They typically provide one of the simpler ways for processes to communicate with one another, although they also have some limitations. In implementing a pipe, four issues must be considered:

      1. Does the pipe allow bidirectional communication, or is communication unidirectional?
      2. If two-way communication is allowed, is it half duplex (data can travel only one way at a time) or full duplex (data can travel in both directions at the same time)?
      3. Must a relationship (such as parent–child) exist between the communicating processes?
      4. Can the pipes communicate over a network, or must the communicating processes reside on the same machine?

      In the following sections, we explore two common types of pipes used on both UNIX and Windows systems: ordinary pipes and named pipes.

      Windows ALPC (Advanced Local Procedure Call) provides a message-passing mechanism for communication between processes, optimized for performance within the Windows environment. Windows uses connection ports for clients to request services from server processes. ALPC supports different message-passing methods, depending on the message size, including smaller messages using a message queue and larger ones via section objects, which leverage shared memory. Notably, ALPC is designed to handle local communication on the same machine efficiently, and its internal complexity allows clients to make decisions about message size to avoid unnecessary copying. While ALPC is a powerful IPC mechanism, it is not directly exposed to application developers but is used behind the scenes by the system's standard remote procedure call (RPC) system.
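The size-based tiering the note describes can be sketched as a simple dispatch on message size (the 256-byte small-message limit comes from the text; the section-object size here is a made-up placeholder, and this models the decision only, not the Windows internals):

```python
SMALL_LIMIT = 256          # bytes: fits in the port's message queue
SECTION_LIMIT = 64 * 1024  # hypothetical section-object capacity for this sketch

def choose_transport(message_size):
    """Pick an ALPC-style transport tier for a message of the given size."""
    if message_size <= SMALL_LIMIT:
        return "message queue (copied between processes)"
    if message_size <= SECTION_LIMIT:
        return "section object (shared memory, no copy)"
    return "direct read/write into the client's address space"

assert choose_transport(100).startswith("message queue")
assert choose_transport(4096).startswith("section object")
assert choose_transport(10**6).startswith("direct")
```

The design trade-off is the one the note highlights: copying is cheap and simple for small payloads, while shared memory avoids copies but requires the extra setup message carrying the section object's pointer and size.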

      The section on Pipes focuses on a basic but foundational IPC method in UNIX and Windows systems, where data is transmitted between two processes through a unidirectional or bidirectional conduit. The design of pipes involves considerations like whether they allow bidirectional communication, whether a relationship is required between processes (such as parent–child), and whether communication can occur across a network. Named pipes, an extension of regular pipes, allow for more flexible communication, as they can be used by unrelated processes and even across machines, though they still rely on local IPC principles.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript puts forward a statistical method to more accurately report the significance of correlations within data. The motivation for this study is two-fold. First, the publication of biological studies demands the report of p-values, and it is widely accepted that p-values below the arbitrary threshold of 0.05 give the authors of such studies justification to draw conclusions about their data. Second, many biological studies are limited by the number of replicate samples that are feasible, with replicates of less than 5 typical. The authors report a statistical tool that uses a permute-match approach to calculate p-values. Notably, the proposed method reduces p-values from around 0.2 to 0.04 as compared to a standard permutation test with a small sample size. The approach is clearly explained, including detailed mathematical explanations and derivations. The advantage of the approach is also demonstrated through analysis of computer-generated synthetic data with specified correlation and analysis of previously published data related to fish schooling. The authors make a clear case that this method is an improvement over the more standard approach currently used, and also demonstrate the impact of this methodology on the ability to obtain p-values that are the standard for biological research. Overall, this paper is very strong. While the subject matter seems somewhat specialized, I would make the case that this will be an important study that has broad general interest to readers. The findings are very general and applicable to many research contexts. Experimentalists also want to report accurate p-values in their work and better understand how these values are calculated. Although I believe the previous statement is true, I am not sure that many research groups doing biological work are reading specialized statistics journals regularly. Therefore a useful and broadly applicable statistical tool is well placed in this journal.

      Strengths:

      The proposed method is broadly applicable to many realistic datasets in many experimental contexts.

      The power of this method was demonstrated with both real experimental data and "synthetic" data. The advantages of the tool are clearly reported. The zebrafish data is a great example dataset.

      The method solves a real-life problem that is frequently encountered by many experimental groups in the biological sciences.

      The writing of the paper is surprisingly clear, given the technical nature of the subject matter. I would not at all consider myself a statistician or mathematician, but I found the text easy to follow. The authors did an impressive job guiding the reader through material that would often be difficult to grasp. The introduction was also well-written and clearly motivated the goals of the study.

      Weaknesses:

      A few changes could be made if the manuscript is revised. I would consider all of these points minor, but the paper could be improved if these points were addressed.

      (1) The caption of Figure 2 doesn't seem to mention panel D. Figure A-2 also does not mention C in the caption.

      (2) Figure 2D is a little hard to follow. First, the definition of "Power" is not clear, and I couldn't find the precise definition in the text. Second, the legend for the different lines in 2D is only given in Figure A-2. Perhaps a portion of the caption for Figure 2 is missing?

      (3) The concept of circular variance for the fish data was hard to understand/visualize. The equation on line 326 did not help much. If there is a very simple picture that could be added near line 326 that helps to explain Ct and theta, that could be a big help for some readers who do not work on related systems. The analysis performed is understandable; the reader just has to accept that circular variance captures the degree of alignment of the fish.

      (4) For the data discussed in Figure 3, I wasn't 100% sure how the time windows were selected. In the caption, it says "time series to different lengths starting from the first frame". So the 20 s time window was from t = 0 to t = 20 s. Would a different result be obtained if a different 20 s window was chosen (from t = 4 min to t = 4 min 20 s, just to give a specific example)? I suppose by chance one of the time windows would give a p-value less than the target 0.05; that wouldn't be surprising. Maybe a random time window should be selected (although I am not indicating that what was reported was incorrect)? A little more discussion on this aspect of the study may be helpful.
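The small-sample limitation the review keeps returning to can be illustrated with a standard (exhaustive) permutation test for correlation: with n paired observations there are only n! relabelings, so the smallest attainable p-value is 1/n! (a generic sketch of the standard test, not the paper's permute-match method; the data values below are invented):

```python
import math
from itertools import permutations

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def permutation_p(x, y):
    """Exhaustive two-sided permutation test on the correlation of x and y."""
    observed = abs(pearson_r(x, y))
    count = sum(abs(pearson_r(x, perm)) >= observed
                for perm in permutations(y))
    return count / math.factorial(len(y))

x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.3, 2.9, 4.2]
p = permutation_p(x, y)
# With n = 4 there are only 4! = 24 permutations, so p can never fall
# below 1/24 (about 0.042), however strong the observed correlation is.
assert 1 / 24 <= p <= 1.0
```

This granularity is exactly why a standard permutation test with 3-5 replicates struggles to reach conventional significance thresholds, motivating methods like the one under review.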

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      The authors have created a system for designing and running experimental pipelines to control and coordinate different programs and devices during an experiment, called Heron. Heron is based around a graphical tool for creating a Knowledge Graph made up of nodes connected by edges, with each node representing a separate Python script, and each edge being a communication pathway connecting a specific output from one node to an input on another. Each node also has parameters that can be set by the user during setup and runtime, and all of this behavior is concisely specified in the code that defines each node. This tool tries to marry the ease of use, clarity, and self-documentation of a purely graphical system like Bonsai with the flexibility and power of a purely code-based system like Robot Operating System (ROS).

      Strengths:

      The underlying idea behind Heron, of combining a graphical design and execution tool with nodes that are made as straightforward Python scripts, seems like a great way to get the relative strengths of each approach. The graphical design side is clear, self-explanatory, and self-documenting, as described in the paper. The underlying code for each node tends to also be relatively simple and straightforward, with a lot of the complex communication architecture successfully abstracted away from the user. This makes it easy to develop new nodes, without needing to understand the underlying communications between them. The authors also provide useful and well-documented templates for each type of node to further facilitate this process. Overall this seems like it could be a great tool for designing and running a wide variety of experiments, without requiring too much advanced technical knowledge from the users.

      The system was relatively easy to download and get running, following the directions and already has a significant amount of documentation available to explain how to use it and expand its capabilities. Heron has also been built from the ground up to easily incorporate nodes stored in separate Git repositories and to thus become a large community-driven platform, with different nodes written and shared by different groups. This gives Heron a wide scope for future utility and usefulness, as more groups use it, write new nodes, and share them with the community. With any system of this sort, the overall strength of the system is thus somewhat dependent on how widely it is used and contributed to, but the authors did a good job of making this easy and accessible for people who are interested. I could certainly see Heron growing into a versatile and popular system for designing and running many types of experiments.

      Weaknesses:

      (1) The number one thing that was missing from the paper was any kind of quantification of the performance of Heron in different circumstances. Several useful and illustrative examples were discussed in depth to show the strengths and flexibility of Heron, but there was no discussion or quantification of performance, timing, or latency for any of these examples. These seem like very important metrics to measure and discuss when creating a new experimental system.

      Heron is essentially a thin abstraction layer over signal passing across processes. Given its design approach, it is up to the code of each Node to deal with issues of timing, syncing and latency, and thus up to each user to make sure the Nodes they author fulfil their experimental requirements. Having said that, Heron provides a large number of tools to allow users to optimise the generated Knowledge Graphs for their use cases. To showcase these tools, we have expanded on the third experimental example in the paper with three extra sections, two of which relate to Heron's performance and syncing capabilities. One focuses on Heron's CPU load requirements (and existing Heron tools to keep those at acceptable limits) and another focuses on post-experiment synchronisation of all the different data sets a multi-Node experiment generates.

      (2) After downloading and running Heron with some basic test Nodes, I noticed that many of the nodes were each using a full CPU core on their own. Given that this basic test experiment was just waiting for a keypress, triggering a random number generator, and displaying the result, I was quite surprised to see over 50% of my 8-core CPU fully utilized. I don’t think that Heron needs to be perfectly efficient to accomplish its intended purpose, but I do think that some level of efficiency is required. Some optimization of the codebase should be done so that basic tests like this can run with minimal CPU utilization. This would then inspire confidence that Heron could deal with a real experiment that was significantly more complex without running out of CPU power and thus slowing down.

      The original Heron allowed the OS to choose how to manage resources over the required processes. We were aware that this could lead to significant use of CPU time, as well as occasionally significant packet loss (which was dependent on the OS and its configuration). This loss happened mainly when the Node was running a secondary process (like the Unity game process in the 3rd example). To mitigate these problems, we have now implemented a feature allowing the user to choose the CPU core that each Node's worker function runs on, as well as any extra processes the worker process initialises. This is accessible from the Saving secondary window of the node. This stops the OS from swapping processes between CPUs and eliminates the dropping of packets due to the OS behaviour. It also significantly reduces the utilised CPU time. To showcase this, we initially ran the simple example mentioned by the reviewer. The computer running only background services was using 8% of CPU (8 cores). With the Heron GUI running but with no active Graph, the CPU usage went to 15%. With the Graph running and Heron's processes running on OS-attributed CPU cores, the total CPU was at 65% (so very close to the reviewer's 50%). By choosing a different CPU core for each of the three worker processes the CPU went down to 47%, and finally, when all processes were forced to run on the same CPU core, the CPU load dropped to 30%. So, Heron in its current implementation running its GUI and 3 Nodes takes 22% of CPU load. This is still not ideal but is a consequence of the overhead of running multiple processes vs multiple threads. We believe that, given Heron's latest optimisation, offering more control of system management to the user, the benefits of multi-process applications outweigh this hit in system resources.
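Pinning a process to a fixed CPU core, as described in the response above, can be done from Python on Linux with `os.sched_setaffinity` (a generic, Linux-only illustration of the technique, not Heron's actual implementation):

```python
import os

pid = 0  # 0 means "the calling process"

# Cores the scheduler currently allows this process to run on.
allowed = os.sched_getaffinity(pid)
core = min(allowed)

# Pin the process to a single core, preventing the OS from migrating
# it between CPUs (the behaviour Heron now exposes per Node).
os.sched_setaffinity(pid, {core})
assert os.sched_getaffinity(pid) == {core}

# Restore the original affinity so the rest of the program is unaffected.
os.sched_setaffinity(pid, allowed)
assert os.sched_getaffinity(pid) == allowed
```

Pinning avoids migration-induced cache invalidation and the scheduler-driven packet drops described above, at the cost of losing the OS's automatic load balancing.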

      We have also increased the scope of the third example we provide in the paper and there we describe in detail how a full-scale experiment with 15 Nodes (which is the upper limit of number of Nodes usually required in most experiments) impacts CPU load. 

      Finally, we have added on Heron’s roadmap projects extra tasks focusing only on optimisation (profiling and using Numba for the time critical parts of the Heron code).

      (3) I was also surprised to see that, despite being meant specifically to run on and connect diverse types of computer operating systems and being written purely in Python, the Heron Editor and GUI must be run on Windows. This seems like an unfortunate and unnecessary restriction, and it would be great to see the codebase adjusted to make it fully cross-platform compatible.

      This point was also mentioned by reviewer 2. This was a mistake on our part and has now been corrected in the paper. Heron (GUI and underlying communication functionality) can run on any machine that the underlying Python libraries run on: Windows, Linux (both x86 and Arm architectures) and macOS. We have tested it on Windows (10 and 11, both x64), a Linux PC (Ubuntu 20.04.6, x64) and a Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). The Windows and Linux versions of Heron have undergone extensive debugging, and all of the available Nodes (that are not OS-specific) run on those two systems. We are in the process of debugging the Nodes' functionality for RasPi. The macOS version, although functional, requires further work to make sure all of the basic Nodes are functional (which is not the case at the moment). We have also updated our manuscript (Multiple machines, operating systems and environments) to include the above information.

      (4) Lastly, when I was running test experiments, sometimes one of the nodes, or part of the Heron editor itself would throw an exception or otherwise crash. Sometimes this left the Heron editor in a zombie state where some aspects of the GUI were responsive and others were not. It would be good to see a more graceful full shutdown of the program when part of it crashes or throws an exception, especially as this is likely to be common as people learn to use it. More problematically, in some of these cases, after closing or force quitting Heron, the TCP ports were not properly relinquished, and thus restarting Heron would run into an "address in use" error. Finding and killing the processes that were still using the ports is not something that is obvious, especially to a beginner, and it would be great to see Heron deal with this better. Ideally, code would be introduced to carefully avoid leaving ports occupied during a hard shutdown, and furthermore, when the address in use error comes up, it would be great to give the user some idea of what to do about it.

      A lot of effort has been put into Heron to achieve graceful shut down of processes, especially when these run on different machines that do not know when the GUI process has closed. The code that is being suggested to avoid leaving ports open has been implemented and this works properly when processes do not crash (Heron is terminated by the user) and almost always when there is a bug in a process that forces it to crash. In the version of Heron available during the reviewing process there were bugs that caused the above behaviour (Node code hanging and leaving zombie processes) on MacOS systems. These have now been fixed. There are very seldom instances though, especially during Node development, that crashing processes will hang and need to be terminated manually. We have taken on board the reviewer’s comments that users should be made more aware of these issues and have also described this situation in the Debugging part of Heron’s documentation. There we explain the logging and other tools Heron provides to help users debug their own Nodes and how to deal with hanging processes.

      Heron is still in alpha (usable but with bugs) and the best way to debug it and iron out all the bugs in all use cases is through usage from multiple users and error reporting (we would be grateful if the errors the reviewer mentions could be reported in Heron’s github Issues page). We are always addressing and closing any reported errors, since this is the only way for Heron to transition from alpha to beta and eventually to production code quality.

      Overall I think that, with these improvements, this could be the beginning of a powerful and versatile new system that would enable flexible experiment design with a relatively low technical barrier to entry. I could see this system being useful to many different labs and fields. 

      We thank the reviewer for the positive and supportive words and for the constructive feedback. We believe we have now addressed all the raised concerns.

      Reviewer #2 (Public Review):

      Summary:

      The authors provide an open-source graphic user interface (GUI) called Heron, implemented in Python, that is designed to help experimentalists to

      (1) design experimental pipelines and implement them in a way that is closely aligned with their mental schemata of the experiments,

      (2) execute and control the experimental pipelines with numerous interconnected hardware and software on a network.

      The former is achieved by representing an experimental pipeline using a Knowledge Graph and visually representing this graph in the GUI. The latter is accomplished by using an actor model to govern the interaction among interconnected nodes through messaging, implemented using ZeroMQ. The nodes themselves execute user-supplied code in, but not limited to, Python.

      Using three showcases of behavioral experiments on rats, the authors highlighted three benefits of their software design:

      (1) the knowledge graph serves as a self-documentation of the logic of the experiment, enhancing the readability and reproducibility of the experiment,

      (2) the experiment can be executed in a distributed fashion across multiple machines that each has a different operating system or computing environment, such that the experiment can take advantage of hardware that sometimes can only work on a specific computer/OS, a commonly seen issue nowadays,

      (3) the users supply their own Python code for node execution, which is meant to be friendlier to those who do not have a strong programming background.

      Strengths:

      (1) The software is light-weight and open-source, provides a clean and easy-to-use GUI,

      (2) The software answers the need of experimentalists, particularly in the field of behavioral science, to deal with the diversity of hardware that is restricted to running on dedicated systems.

      (3) The software has a solid design that seems to be functionally reliable and useful under many conditions, demonstrated by a number of sophisticated experimental setups.

      (4) The software is well documented. The authors pay special attention to documenting the usage of the software and setting up experiments using this software.

      Weaknesses:

      (1) While the software implementation is solid and has proven effective in designing the experiment showcased in the paper, the novelty of the design is not made clear in the manuscript. Conceptually, both the use of graphs and visual experimental flow design have been key features in many widely used software packages, as suggested in the background section of the manuscript. In particular, contrary to the authors' claim that only pre-defined elements can be used in Simulink or LabView, Simulink introduced the MATLAB Function Block back in 2011, and Python code can be used in LabView since 2018. Such customization of nodes is akin to what the authors presented.

      In the Heron manuscript we have provided an extensive literature review of existing systems from which Heron has borrowed ideas. We never wished to say that graphs and visual code are what set Heron apart, since these are technologies predating Heron by many years and implemented in a large number of software packages. We also do not believe that we have mentioned that LabView or Simulink can utilise only predefined nodes. What we have said is that in such systems (like LabView, Simulink and Bonsai) the focus of the architecture is on prespecified low-level elements, while the ability for users to author their own is there only as an afterthought. The difference with Heron is that in the latter the focus is on the users developing their own elements. One could think of LabView-style software as node-based languages (with low-level visual elements like loops and variables) that also allow extra scripting, while Heron is a graphical wrapper around Python where nodes are graphical representations of whole processes. To our knowledge there is no other software that allows the very fast generation of graphical elements representing whole processes whose communication can also be defined graphically. Apart from this distinction, Heron also allows a graphical approach to writing code for processes that span different machines, which again to our knowledge is a novelty of our approach and one of its strongest points towards ease of experimental pipeline creation (without sacrificing expressivity).

      (2) The authors claim that the knowledge graph can be considered as a self-documentation of an experiment. I found it to be true to some extent. Conceptually it is a welcome feature, and the fact that the same visualization of the knowledge graph can be used to run and control experiments is highly desirable (but see point 1 about novelty). However, I found it largely inadequate for a person to understand an experiment from the knowledge graph as visualized in the GUI alone. While the information flow is clear, and it seems easier to navigate a codebase for an experiment using this method, the design of the GUI does not make it a one-stop place to understand the experiment. Take the Knowledge Graph in Supplementary Figure 2B as an example: it is associated with the first showcase in the result section highlighting this self-documentation capability. I can see what the basic flow is through the disjoint graph, where 1) one needs to press a key to start a trial, and 2) camera frames are saved into an avi file presumably using FFMPEG. Unfortunately, it is not clear what the parameters are and what each block is trying to accomplish without the explanation from the authors in the main text. Neither is it clear what the experiment protocol is without the help of Supplementary Figure 2A.

      In my opinion, text/figures are still key to documenting an experiment, including its goals and protocols, but the authors could take advantage of the fact that they are designing a GUI where this information, with properly designed API, could be easily displayed, perhaps through user interaction. For example, in Local Network -> Edit IPs/ports in the GUI configuration, there is a good tooltip displaying additional information for the "password" entry. The GUI for the knowledge graph nodes can very well utilize these tooltips to show additional information about the meaning of the parameters, what a node does, etc, if the API also enforces users to provide this information in the form of, e.g., Python docstrings in their node template. Similarly, this can be applied to edges to make it clear what messages/data are communicated between the nodes. This could greatly enhance the representation of the experiment from the Knowledge graph.
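The docstring-to-tooltip idea the reviewer proposes could be sketched with Python's `inspect` module (the node function below is hypothetical, purely to illustrate the pattern; it is not how Heron's API is actually defined):

```python
import inspect

def frame_grabber(camera_id: int, fps: int = 30):
    """Capture frames from the given camera and emit them downstream.

    camera_id: index of the camera device to open.
    fps: target capture rate in frames per second.
    """

def tooltip_for(node_func):
    # A GUI could pull each node's docstring and show its first line
    # as hover text, falling back to a placeholder when it is missing.
    doc = inspect.getdoc(node_func)
    return doc.splitlines()[0] if doc else "(no documentation provided)"

assert tooltip_for(frame_grabber).startswith("Capture frames")
```

Enforcing such docstrings in the node template would let the graph view surface per-node and per-parameter documentation without the user leaving the GUI.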

      In the first showcase example in the paper, “Probabilistic reversal learning. Implementation as self-documentation”, we go through the steps that one would follow in order to understand the functionality of an experiment through Heron’s Knowledge Graph. The Graph is not just the visual representation of the Nodes in the GUI but also their corresponding code bases. We mention that the way Heron’s API constrains how a Node’s code is constructed (through an Actor-based paradigm) allows experimenters to easily go to the code base of a specific Node and understand its two functions (initialisation and worker) without getting bogged down in the code base of the whole Graph (since these two functions never call code from any other Nodes). Newer versions of Heron facilitate this easy access to the appropriate code by also allowing users to attach their favourite IDE to Heron and open in it any Node’s two scripts (worker and com) by double clicking on the Node in Heron’s GUI. On top of this, Heron now (in the versions developed in response to the reviewers’ comments) allows Node creators to add extensive comments to a Node, as well as separate comments on the Node’s parameters and input and output ports. These can be seen as tooltips when one hovers over the Node (a feature that can be turned on or off with the Info button on every Node).

      As Heron stands at the moment we have not claimed that the Heron GUI gives the full picture in the self-documentation of a Graph. We do take note, though, of the reviewer’s desire to have the GUI be the only tool a user would need in order to understand an experimental implementation. The solution to this is the same as the one described by the reviewer: using the GUI to show the user the parts of the code relevant to a specific Node, without the user having to go to a separate IDE or code editor. The reason this has not been implemented yet is the lack of a text editor widget in the underlying GUI library (DearPyGUI). This is on their roadmap for their next large release, and once it exists we will use it to implement exactly the idea the reviewer is suggesting, with the capability not only to read comments and code but also to directly edit a Node’s code (see Heron’s roadmap). Heron’s API at the moment is ideal for providing such a text editor straight from the GUI.

      (3) The design of Heron was primarily with behavioral experiments in mind, in which highly accurate timing is not a strong requirement. Experiments in some other areas that this software is also hoping to expand to, for example, electrophysiology, may need very strong synchronization between apparatus, for example, the record timing and stimulus delivery should be synced. The communication mechanism implemented in Heron is asynchronous, as I understand it, and the code for each node is executed once upon receiving an event at one or more of its inputs. The paper, however, does not include a discussion, or example, about how Heron could be used to address issues that could arise in this type of communication. There is also a lack of information about, for example, how nodes handle inputs when their ability to execute their work function cannot keep up with the frequency of input events. Does the publication/subscription handle the queue intrinsically? Will it create problems in real-time experiments that make multiple nodes run out of sync? The reader could benefit from a discussion about this if they already exist, and if not, the software could benefit from implementing additional mechanisms such that it can meet the requirements from more types of experiments.

      In order to address the above lack of explanation (which the first reviewer also pointed out) we expanded the third experimental example in the paper with three more sections. One focuses solely on explaining how in this example (which acquires and saves large amounts of data from separate Nodes running on different machines) one would time-align the different data packets generated in different Nodes. The techniques described there are directly applicable to experiments where the synchronisation requirements are more stringent than in the behavioural experiment we showcase (such as ephys experiments).
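      As a generic illustration of the kind of post-hoc time alignment described above (a sketch of the general technique, not Heron’s actual code; the function name and data layout are our own invention), nearest-timestamp matching between two recorded packet streams can be done with the standard library:

```python
import bisect

def align_nearest(ts_a, ts_b):
    """For each timestamp in ts_a, return the index of the nearest
    timestamp in the sorted list ts_b (nearest-neighbour alignment)."""
    indices = []
    for t in ts_a:
        i = bisect.bisect_left(ts_b, t)
        # Candidates: the entry just before and just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(ts_b)]
        indices.append(min(candidates, key=lambda j: abs(ts_b[j] - t)))
    return indices

# Hypothetical timestamps from a camera Node and an ephys Node (seconds):
camera_ts = [0.00, 1.05, 2.40]
ephys_ts = [0.10, 1.00, 2.00, 3.00]
print(align_nearest(camera_ts, ephys_ts))  # [0, 1, 2]
```

      With each Node saving its own timestamps, a mapping like this is enough to relate packets across machines after the experiment.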

      Regarding what happens to packets when the worker function of a Node is too slow to handle its traffic, this is mentioned in the paper (Code architecture paragraph): “Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running.” This is also explained in more detail in Heron’s documentation. The reasoning for a no-buffer system (as described in the documentation) is that for the use cases Heron is designed to handle, we believe there is no situation where a Node would receive large amounts of data in bursts and very little data the rest of the time (the case in which a buffer would make sense). Nodes in most experiments will either be data intensive but with a constant or near-constant data rate (e.g. input from a camera or ephys system) or will have variable reception rates but always with small data loads (e.g. buttons). The second case is not an issue, and the first case cannot be dealt with by a buffer but only by appropriate code design, since buffering data arriving at a Node that is too slow for its input will just postpone the inevitable crash. Heron’s architectural principle in this case is to allow these ‘mistakes’ (i.e. packet dropping) to happen so that the pipeline continues to run, and to transfer the responsibility of making Nodes fast enough to the author of each Node. At the same time Heron provides tools (see the Debugging section of the documentation and the time alignment paragraph of the “Rats playing computer games” example in the manuscript) that make it easy to detect packet drops and either correct them or allow them, while still permitting time alignment between incoming and outgoing packets. In the very rare case where a buffer is required, Heron’s do-it-yourself logic makes it easy for a Node developer to implement their own Node-specific buffer.
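      The drop-while-busy policy described above can be sketched as a mailbox of size one: a new message is accepted only if the previous one has already been consumed. This is a generic Python illustration of the principle, not Heron’s actual implementation:

```python
import queue

# A mailbox of size 1: if the worker has not consumed the previous
# message yet, new messages are dropped rather than buffered.
mailbox = queue.Queue(maxsize=1)

def deliver(message):
    """Try to hand a message to the worker; drop it if the worker is busy."""
    try:
        mailbox.put_nowait(message)
        return True       # accepted
    except queue.Full:
        return False      # dropped

# Simulate a worker that only consumes one message per two deliveries.
accepted, dropped = [], []
for i in range(6):
    (accepted if deliver(i) else dropped).append(i)
    if i % 2 == 1:
        mailbox.get_nowait()  # worker finishes and frees the mailbox

print(accepted)  # [0, 2, 4]
print(dropped)   # [1, 3, 5]
```

      The pipeline keeps running at the worker’s own pace; making the worker fast enough is left to the Node’s author.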

      (4) The authors mentioned in "Heron GUI’s multiple uses" that the GUI can be used as an experimental control panel where the user can update the parameters of the different Nodes on the fly. This is a very useful feature, but it was not demonstrated in the three showcases. A demonstration could greatly help to support this claim.

      As the reviewer mentions, we have found Heron’s GUI’s double role as an on-line experimental controller a very useful capability during our experiments. We have expanded the last experimental example to also showcase this, by showing how, in the “Rats playing computer games” experiment, we used the parameters of two Nodes to change the arena’s behaviour while the experiment was running, depending on how the subject was behaving at the time (thus exploring a much larger set of parameter combinations faster during the exploratory periods of our shaping protocol construction).

      (5) The API for node scripts can benefit from having a better structure as well as having additional utilities to help users navigate the requirements, and provide more guidance to users in creating new nodes. A more standard practice in the field is to create three abstract Python classes, Source, Sink, and Transform that dictate the requirements for initialisation, work_function, and on_end_of_life, and provide additional utility methods to help users connect between their code and the communication mechanism. They can be properly docstringed, along with templates. In this way, the com and worker scripts can be merged into a single unified API. A simple example that can cause confusion in the worker script is the "worker_object", which is passed into the initialise function. It is unclear what this object this variable should be, and what attributes are available without looking into the source code. As the software is also targeting those who are less experienced in programming, setting up more guidance in the API can be really helpful. In addition, the self-documentation aspect of the GUI can also benefit from a better structured API as discussed in point 2 above.

      The reviewer is right that using abstract classes to expose the required API to users would be a more standard practice. The reason we chose not to do this was to keep Heron easily accessible to entry-level Python programmers who are not yet familiar with object-oriented programming ideas. So instead of providing abstract classes we expose only the implementation of three functions, which are part of the worker classes, while the classes themselves are never seen by the users of the API. The point about users’ access to more information regarding a few objects used in the API (the worker object, for example) has been taken on board, and we have now addressed this by type hinting all these objects, both in the templates and, more importantly, in the automatically generated code that Heron now creates when a user chooses to create a Node graphically (a feature of Heron not present in the version available in the initial submission of this manuscript).

      (6) The authors should provide more pre-defined elements. Even though the ability for users to run arbitrary code is the main feature, the initial adoption of a codebase by a community, in which many members are not so experienced with programming, is the ability for them to use off-the-shelf components as much as possible. I believe the software could benefit from a suite of commonly used Nodes.

      There are currently 12 Node repositories in the Heron-repositories project on GitHub, with more than 30 Nodes, 20 of which are general use (not implementing a specific experiment’s logic). This list will continue to grow, but we fully appreciate the reviewer’s point that adoption will depend on the existence of a large number of commonly used Nodes (for example, NumPy and OpenCV Nodes) and are working towards this goal.

      (7) It is not clear to me if there is any capability or utilities for testing individual nodes without invoking a full system execution. This would be critical when designing new experiments and testing out each component.

      There is no capability to run the code of an individual Node outside Heron’s GUI. A user could potentially design and test parts of a Node’s code before adding them to the Node, but we have found this to be a highly inefficient way of developing new Nodes. In our hands the best approach for Node development was to generate test inputs and/or outputs using the “User Defined Function 1I 1O” Node, where one can quickly write a function and make it accessible from a Node. Those test outputs can then be pushed into the Node under development, or its outputs can be pushed into the test function, allowing incremental development without having to connect it to the Nodes it would be connected to in an actual pipeline. For example, one can easily create a small function that, when a user presses a key, generates the same output (if run from a “User Defined Function 1I 1O” Node) as an Arduino Node reading some buttons. This output can then be passed into an experiment-logic Node under development that needs to do something with this input. In this way, during Node development Heron allows the generation of simulated hardware inputs and outputs without actually running the hardware. We have also added this way of developing Nodes to our manuscript (Creating a new Node).
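      The key-press-to-button simulation described above could look something like the following; the message format and all names here are purely illustrative, not Heron’s actual wire format:

```python
def fake_button_node(key_pressed: str) -> list:
    """Emit the kind of message a hardware button Node might produce,
    so a downstream Node can be developed without the Arduino present.
    The (topic, payload) structure is a hypothetical example format."""
    mapping = {"a": "left_button", "d": "right_button"}
    button = mapping.get(key_pressed)
    if button is None:
        return []                        # ignore unmapped keys
    return ["buttons", {"pressed": button}]

print(fake_button_node("a"))  # ['buttons', {'pressed': 'left_button'}]
print(fake_button_node("x"))  # []
```

      A function of this shape, run from a “User Defined Function 1I 1O” Node, lets the experiment-logic Node under development receive realistic input before any hardware is connected.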

      Reviewer #3 (Public Review):

      Summary:

      The authors present a Python tool, Heron, that provides a framework for defining and running experiments in a lab setting (e.g. in behavioural neuroscience). It consists of a graphical editor for defining the pipeline (interconnected nodes with parameters that can pass data between them), an API for defining the nodes of these pipelines, and a framework based on ZeroMQ, responsible for the overall control and data exchange between nodes. Since nodes run independently and only communicate via network messages, an experiment can make use of nodes running on several machines and in separate environments, including on different operating systems.

      Strengths:

      As the authors correctly identify, lab experiments often require a hodgepodge of separate hardware and software tools working together. A single, unified interface for defining these connections and running/supervising the experiment, together with flexibility in defining the individual subtasks (nodes) is therefore a very welcome approach. The GUI editor seems fairly intuitive, and Python as an accessible programming environment is a very sensible choice. By basing the communication on the widely used ZeroMQ framework, they have a solid base for the required non-trivial coordination and communication. Potential users reading the paper will have a good idea of how to use the software and whether it would be helpful for their own work. The presented experiments convincingly demonstrate the usefulness of the tool for realistic scientific applications.

      Weaknesses:

      (1) In my opinion, the authors somewhat oversell the reproducibility and "self-documentation" aspect of their solution. While it is certainly true that the graph representation gives a useful high-level overview of an experiment, it can also suffer from the same shortcomings as a "pure code" description of a model - if a user gives their nodes and parameters generic/unhelpful names, reading the graph will not help much.

      This is a problem that, to our understanding, no software solution can possibly address. Yet we argue that having a visual representation of how different inputs and outputs connect to each other is a substantial benefit compared to “pure code”, especially when the developer of the experiment has used badly formatted variable names.

      (2) Making the link between the nodes and the actual code is also not straightforward, since the code for the nodes is spread out over several directories (or potentially even machines), and not directly accessible from within the GUI. 

      This is not accurate. The obligatory code of a Node always exists within a single folder, and Heron’s API makes it rather cumbersome to spread scripts relating to a Node across separate folders. The Node folder structure can potentially be copied over different machines, which is why Heron is tightly integrated with git practices (and even politely asks the user, with popup windows, to create git repositories of any Nodes they create whilst using Heron’s automatic Node generator system). Heron’s documentation is also very clear on the folder structure of a Node, which keeps the required code in the same place across machines and, more importantly, across experiments and labs. Regarding the direct accessibility of the code from the GUI, we took the reviewer’s comments on board and have taken the first step towards correcting this. One can now attach their favourite IDE to Heron and double click on any Node to open its two main scripts (com and worker) in that IDE, embedded in whatever code project they choose (also set in Heron’s settings window). On top of this, Heron now allows the addition of notes both for a Node and for all its parameters, inputs and outputs, which can be viewed by hovering the mouse over them on the Node’s GUI. The final step towards GUI-code integration will be a code editor in Heron’s GUI, but this has to wait for further development of Heron’s underlying GUI library, DearPyGUI.

      (3) The authors state that "[Heron’s approach] confers obvious benefits to the exchange and reproducibility of experiments", but the paper does not discuss how one would actually exchange an experiment and its parameters, given that the graph (and its json representation) contains user-specific absolute filenames, machine IP addresses, etc, and the parameter values that were used are stored in general data frames, potentially separate from the results. Neither does it address how a user could keep track of which versions of files were used (including Heron itself).

      Heron’s Graphs, like any experimental implementation, must contain machine-specific strings. These are accessible either from Heron’s GUI, when a Graph json file is opened, or from the json file itself. Heron in this regard does not do anything differently from any other software, other than saving the Graphs into human-readable json files that users can easily manipulate directly.

      Heron provides a method for users to save every change of the Node parameters that might happen during an experiment, so that it can be fully reproduced. The dataframes are generated in the folders specified by the user in each of the Nodes (and all those paths are saved in the json file of the Graph). We understand that Heron offers the user a certain degree of freedom (this versatility is Heron’s main reason to exist) to generate data files wherever they want, but it makes sure every file path gets recorded for subsequent reproduction. So, Heron behaves pretty much like any other open source software. What we wanted to highlight as the benefit of Heron for exchange and reproducibility is the ability of experimenters to take a Graph from another lab (with its machine-specific file paths and IP addresses) and, by examining its graphical representation, quickly tweak it to run on their own systems. That is achievable because a Heron experiment is constructed from a small number of Nodes (usually 5 to 15), whose file paths can be trivially changed in the GUI or directly in the json file, while the LAN setup of the machines used can easily be reconstructed from the information saved in the secondary GUIs.
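      For instance, re-targeting a Graph’s machine-specific strings could be a few lines of editing on the json file; the key names below are hypothetical, not Heron’s actual schema:

```python
import json

# Hypothetical fragment of a Graph json file received from another lab.
graph = json.loads("""
{
  "nodes": [
    {"name": "Camera", "worker_script": "C:/lab_A/Heron/camera/worker.py",
     "ip": "192.168.1.5", "port": 5560}
  ]
}
""")

# Swap in this lab's paths and machine addresses.
for node in graph["nodes"]:
    node["worker_script"] = node["worker_script"].replace("C:/lab_A",
                                                          "/home/lab_b")
    node["ip"] = "10.0.0.7"

print(graph["nodes"][0]["worker_script"])  # /home/lab_b/Heron/camera/worker.py
```

      Because the file is human-readable json, the same edits can equally be made in a text editor or through Heron’s GUI.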

      Where Heron needs to improve (and this is a major point in Heron’s roadmap) is in better integrating the saved experiments with the git versions of Heron and of the Nodes that were used for a specific save. This, we appreciate, is very important for full reproducibility of an experiment, and it is a feature we will soon implement. More specifically, users will save, together with a Graph, the versions of all the repositories used, and during load the code base utilised will come from the recorded versions and not from the current head of the different repositories. This is a feature we are currently working on and, as our roadmap suggests, it will be implemented by the release of Heron 1.0.

      (4) Another limitation that in my opinion is not sufficiently addressed is the communication between the nodes, and the effect of passing all communications via the host machine and SSH. What does this mean for the resulting throughput and latency - in particular in comparison to software such as Bonsai or Autopilot? The paper also states that "Heron is designed to have no message buffering, thus automatically dropping any messages that come into a Node’s inputs while the Node’s worker function is still running."- it seems to be up to the user to debug and handle this manually?

      There are a few points raised here that require addressing. The first is Heron’s requirement to pass all communication through the main (GUI) machine. We understand (and also state in the manuscript) that this is a limitation that needs to be addressed. We plan to do this by adding to Heron the ability to run headless (see our roadmap). This will allow whole Heron pipelines to run on a second machine, communicating with the main pipeline (run on the GUI machine) through special Nodes. Experimenters will thus be able to define whole pipelines on secondary machines, where the data passed between their Nodes stays on the machine running the pipeline. This is an important feature for Heron and will be one of the first to be implemented next (after the integration of the saving system with git).

      The second point regards Heron’s throughput and latency. Our original manuscript did not describe Heron’s capabilities in this respect, and both other reviewers mentioned this as a limitation. As mentioned above, we have now addressed this by adding a section to our third experimental example that fully describes how much CPU is required to run a full experimental pipeline running on two machines and also utilising non-Python executables (a Unity game). This gives an overview of how heavy pipelines can run on normal computers, given adequate optimisation and Heron’s feature of forcing some Nodes to run their worker processes on a specific core. At the same time, Heron’s use of the 0MQ protocol ensures there are no other delays or speed limitations in message passing: message passing within the same machine is just an exchange of memory pointers, while messages passing between different machines face only the standard speed limitations of the Local Area Network’s Ethernet hardware.

      Finally, regarding the message dropping feature of Heron, as mentioned above this is an architectural decision given the use cases of message passing we expect Heron to come in contact with. For a full explanation of the logic here please see our answer to the 3rd comment by Reviewer 2.

      (5) As a final comment, I have to admit that I was a bit confused by the use of the term "Knowledge Graph" in the title and elsewhere. In my opinion, the Heron software describes "pipelines" or "data workflows", not knowledge graphs - I’d understand a knowledge graph to be about entities and their relationships. As the authors state, it is usually meant to make it possible to "test propositions against the knowledge and also create novel propositions" - how would this apply here?

      We have described Heron as a Knowledge Graph, instead of a pipeline, data workflow or computation graph, in order to emphasise Heron’s distinct operation in contrast to the standard pipelines and data workflows generated by other visual programming software (like LabVIEW and Bonsai). The difference lies in what a user should think of as the base element of a graph, i.e. the Node. In other visual programming paradigms, the Node is defined as a low-level computation, usually a language keyword, language flow control or some simple function. The logic in that case is generated by composing the visual elements (Nodes) together. In Heron the Node is to be thought of as a process, which can be of arbitrary complexity, and the logic of the graph is composed by the user both within each Node and in the way the Nodes are combined together. This is an important distinction in Heron’s basic operational logic, and it is, we argue, the main way Heron allows flexibility in what can be achieved while retaining ease of graph composition (users define their own level of complexity and functionality encompassed within each Node). We found that calling this approach a computation graph (which it is), a pipeline or a data workflow would not accentuate this difference. The term Knowledge Graph was the most appropriate, as it captures the essence of the variable information complexity (even in terms of the length of the shortest string required) defined by a Node.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      -  No buffering implies dropped messages when a node is busy. It seems like this could be very problematic for some use cases... 

      This is a design principle of Heron. We have now provided a detailed explanation of the reasoning behind it in our answer to Reviewer 2 (Paragraph 3) as well as in the manuscript. 

      -  How are ssh passwords stored, and is it secure in some way or just in plain text?  

      For now they are stored as plain text in an unencrypted file that is not part of the repo (if one gets Heron from the repo). Eventually we would like to move to private/public key pairs, but this is not a priority due to the local nature of Heron’s use cases (all machines in an experiment are expected to connect over a LAN).

      Minor notes / copyedits:

      -  Figure 2A: right and left seem to be reversed in the caption. 

      They were. This is now fixed. 

      -  Figure 2B: the text says that proof of life messages are sent to each worker process but in the figure, it looks like they are published by the workers? Also true in the online documentation.  

      The Figure caption was wrong. This is now fixed.

      -  psutil package is not included in the requirements for GitHub

      We have now included psutil in the requirements.

      -  GitHub readme says Python >=3.7 but Heron will not run as written without python >= 3.9 (which is alluded to in the paper)

      The new Heron updates require Python 3.11. We have now updated GitHub and the documentation to reflect this.

      -  The paper mentions that the Heron editor must be run on Windows, but this is not mentioned in the Github readme.  

      This was an error in the manuscript that we have now corrected.

      -  It’s unclear from the readme/manual how to remove a node from the editor once it’s been added.  

      We have now added an X button on each Node to complement the Del key on the keyboard (for macOS users, whose keyboards most of the time do not have this key).

      -  The first example experiment is called the Probabilistic Reversal Learning experiment in text, but the uncertainty experiment in the supplemental and on GitHub.  

      We have now used the correct name (Probabilistic Reversal Learning) in both the supplemental material and on GitHub.

      -  Since Python >=3.9 is required, consider using fstrings instead of str.format for clarity in the codebase  

      Thank you for the suggestion. Recent Heron development has been using f-strings, and we will refactor the older code in the near future.
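      The refactor in question is mechanical; a minimal example of the change (with made-up variable names):

```python
node_name = "Camera"
port = 5560

# Old style, using str.format:
msg_old = "Node {} listening on port {}".format(node_name, port)

# New style, using an f-string (available since Python 3.6):
msg_new = f"Node {node_name} listening on port {port}"

print(msg_new)  # Node Camera listening on port 5560
assert msg_old == msg_new
```

      Besides being shorter, the f-string keeps each value next to the text it fills in, which helps readability in long log messages.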

      -  Grasshopper cameras can run on linux as well through the spinnaker SDK, not just Windows.  

      Fixed in the manuscript. 

      -  Figure 4: Square and star indicators are unclear.

      Increased the size of the indicators to make them clear.

      -  End of page 9: "an of the self" presumably a typo for "off the shelf"?  

      Corrected.

      -  Page 10 first paragraph. "second root" should be "second route"

      Corrected.

      -  When running Heron, the terminal constantly spams Blowfish encryption deprecation warnings, making it difficult to see the useful messages.  

      The solution to this problem is to either update paramiko or install Heron through pip. This possible issue is mentioned in the documentation.

      -  Node input /output hitboxes in the GUI are pretty small. If they could be bigger it would make it easier to connect nodes reliably without mis-clicks.

      We have redone the Node GUI, also increasing the size of the In/Out points.

      Reviewer #2 (Recommendations For The Authors):

      (1) There are quite a few typos in the manuscript, for example: "one can accessess the code", "an of the self", etc.  

      Thanks for the comment. We have now screened the manuscript for possible typos.

      (2) Heron’s GUI can only run on Windows! This seems to be the opposite of the key argument about the portability of the experimental setup.  

      As explained in the answers to Reviewer 1, Heron can run on most machines that its underlying Python libraries run on, i.e. Windows and Linux (both on x86 and Arm architectures). We have tested it on Windows (10 and 11, both x64), a Linux PC (Ubuntu 20.04.6, x64) and a Raspberry Pi 4 (Debian GNU/Linux 12 (bookworm), aarch64). We have now revised the manuscript and the GitHub repo to reflect this.

      (3) Currently, the output is displayed along the left edge of the node, but the yellow dot connector is on the right. It would make more sense to have the text displayed next to the connectors.  

      We have redesigned the Node GUI and have now placed the Out connectors on the right side of the Node.

      (4) The edges are often occluded by the nodes in the GUI. Sometimes it leads to some confusion, particularly when the number of nodes is large, e.g., Fig 4.

      This depends on the capabilities of the DearPyGUI module. At the moment there is no way to control how the edges are drawn.

      Reviewer #3 (Recommendations For The Authors):

      A few comments on the software and the documentation itself:

      - From a software engineering point of view, the implementation seems to be rather immature. While I get the general appeal of "no installation necessary", I do not think that installing dependencies by hand and cloning a GitHub repository is easier than installing a standard package.

      We have now added a pip install capability which also creates a Heron command line command to start Heron with. 

      -The generous use of global variables to store state (minor point, given that all nodes run in different processes), boilerplate code that each node needs to repeat, and the absence of any kind of automatic testing do not give the impression of a very mature software (case in point: I had to delete a line from editor.py to be able to start it on a non-Windows system).  

      As mentioned, the use of global variables in the worker scripts is acceptable partly due to the multi-process nature of the development, and we have found it a friendly approach for MATLAB users who are just starting with Python (a serious consideration for Heron). Also, the parts of the code that would require a singleton (the Editor, for example) are treated as scripts with global variables, while the parts that require the construction of objects are fully embedded in classes (the Node, for example). A future refactoring might make all the parts of the code not seen by the user fully object oriented, but this is a decision with pros and cons that need to be weighed first.

      The absence of testing is an important issue we recognise, but Heron is a GUI app, and non-trivial unit tests would require a keystroke/mouse-movement emulator (like QTest or pytest-qt for Qt-based GUIs). This will be dealt with in the near future (using more general solutions like PyAutoGUI), but it is something that needs a serious amount of effort (quite a bit more than writing unit tests for non-GUI software) and, more importantly, it is nowhere near as robust as standard unit tests (due to the variable nature of the GUI through development), making automatic test authoring almost as laborious a process as the one it is supposed to automate.

      -  From looking at the examples, I did not quite see why it is necessary to write the ..._com.py scripts as Python files, since they only seem to consist of boilerplate code and variable definitions. Wouldn’t it be more convenient to represent this information in configuration files (e.g. yaml or toml)?  

      The com script is not a configuration file; it is a script that launches the communication process of the Node. We could move the variable definitions to a separate toml file (which the com script would then have to read). The pros and cons of such a setup should be considered in a future refactoring.

      Minor comments for the paper:

      -  p.7 (top left): "through its return statement" - the worker loop is an infinite loop that forwards data with a return statement?  

      This is now corrected. The worker loop is an infinite loop and does not return anything; at each iteration it pushes data to the Node’s output.

      -  p.9 (bottom right): "of the self" → "off-the-shelf"  

      Corrected.

      -  p.10 (bottom left): "second root" → "second route"  

      Corrected.

      -  Supplementary Figure 3: Green start and square seem to be swapped (the green star on top is a camera image and the green star on the bottom is value visualization - inversely for the green square).  

      The star and square have been swapped around.

      -  Caption Supplementary Figure 4 (end): "rashes to receive" → "rushes to receive"  

      Corrected.

    1. Out on the street, the largest riot since Conscription was passed in 1944 (bringing in the draft for the final year of the Second World War) broke out along a seven-block length of Rue Ste. Catherine, featuring overturned cars, smashed windows, a shot fired from somewhere and 137 arrests.

      The reason for the fight is obviously much bigger than just hockey.

    1. Shattered windows and the sound of drums / People couldn't believe what I'd become

      "Shattered windows and the sound of drums" uses visual imagery and suggests that people might have started to turn and rebel against him. "People couldn't believe what I'd become" tells me that there were people still confused and unsure about what had happened.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates whether pupil dilation reflects prediction error signals during associative learning, defined formally by Kullback-Leibler (KL) divergence, an information-theoretic measure of information gain. Two independent tasks with different entropy dynamics (decreasing and increasing uncertainty) were analyzed: the cue-target 2AFC task and the letter-color 2AFC task. Results revealed that pupil responses scaled with KL divergence shortly after feedback onset, but the direction of this relationship depended on whether uncertainty (entropy) increased or decreased across trials. Furthermore, signed prediction errors (interaction between frequency and accuracy) emerged at different time windows across tasks, suggesting task-specific temporal components of model updating. Overall, the findings highlight that pupil dilation reflects information-theoretic processes in a complex, context-dependent manner.

      Strengths:

      This study provides a novel and convincing contribution by linking pupil dilation to information-theoretic measures, such as KL divergence, supporting Zénon's hypothesis that pupil responses reflect information gained during learning. The robust methodology, including two independent datasets with distinct entropy dynamics, enhances the reliability and generalisability of the findings. By carefully analysing early and late time windows, the authors capture the temporal dynamics of prediction error signals, offering new insights into the timing of model updates. The use of an ideal learner model to quantify prediction errors, surprise, and entropy provides a principled framework for understanding the computational processes underlying pupil responses. Furthermore, the study highlights the critical role of task context - specifically increasing versus decreasing entropy - in shaping the directionality and magnitude of these effects, revealing the adaptability of predictive processing mechanisms.
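      The information-theoretic quantities discussed here (KL divergence, surprise, entropy) can be illustrated with a toy Bernoulli ideal learner. This is only a hedged sketch, not the authors' actual model: beliefs are collapsed to the predictive probability of one outcome, and the KL divergence is computed between successive predictive Bernoulli distributions rather than between full posterior distributions.

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a Bernoulli belief with P(outcome A) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def kl_divergence(p, q):
    """KL divergence (bits) from Bernoulli(q) (prior) to Bernoulli(p) (posterior)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

# Beta-Bernoulli ideal learner: two counts summarize the belief.
a, b = 1.0, 1.0                   # uniform Beta(1, 1) prior
outcomes = [1, 1, 0, 1, 1, 1]     # toy sequence of observed outcomes
for x in outcomes:
    prior_p = a / (a + b)         # predicted probability before the outcome
    surprise = -math.log2(prior_p if x == 1 else 1 - prior_p)  # Shannon surprise
    a, b = a + x, b + (1 - x)     # Bayesian update of the counts
    post_p = a / (a + b)
    info_gain = kl_divergence(post_p, prior_p)  # information gained by the update
    print(f"x={x} surprise={surprise:.3f} KL={info_gain:.3f} "
          f"H={entropy(post_p):.3f}")
```

Note how, as the counts grow, each new outcome shifts the belief less, so the KL divergence (information gain) shrinks while entropy tracks the remaining uncertainty of the prediction.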

      Weaknesses:

      While this study offers important insights, several limitations remain. The two tasks differ significantly in design (e.g., sensory modality and learning type), complicating direct comparisons and limiting the interpretation of differences in pupil dynamics. Importantly, the apparent context-dependent reversal between pupil constriction and dilation in response to feedback raises concerns about how these opposing effects might confound the observed correlations with KL divergence. Finally, subjective factors such as participants' confidence and internal belief states were not measured, despite their potential influence on prediction errors and pupil responses.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: 

      In this work, Noorman and colleagues test the predictions of the "four-stage model" of consciousness by combining psychophysics and scalp EEG in humans. The study relies on an elegant experimental design to investigate the respective impact of attentional and perceptual blindness on visual processing. 

      The study is very well summarised, the text is clear and the methods seem sound. Overall, a very solid piece of work. I haven't identified any major weaknesses. Below I raise a few questions of interpretation that may possibly be the subject of a revision of the text. 

      We thank the reviewer for their positive assessment of our work and for their extremely helpful and constructive comments that helped to significantly improve the quality of our manuscript.

      (1) The perceptual performance on Fig1D appears to show huge variation across participants, with some participants at chance levels and others with performance > 90% in the attentional blink and/or masked conditions. This seems to reveal that the procedure to match performance across participants was not very successful. Could this impact the results? The authors highlight the fact that they did not resort to postselection or exclusion of participants, but at the same time do not discuss this equally important point. 

      Performance was indeed highly variable between observers, as is commonly found in attentional-blink (AB) and masking studies. For some observers, the AB pushes performance almost to chance level, whereas for others it has almost no effect. A similar effect can be seen in masking. We did our best to match accuracy over participants, while also matching accuracy within participants as well as possible, adjusting mask contrast manually during the experimental session. Naturally, those that are strongly affected by masking need not be the same participants as those that are strongly affected by the AB, given that they rely on different mechanisms (which is also one of the main points of the manuscript). To answer the research question, what mattered most was that, at the group level, performance was well matched between the two key conditions, as all our statistical inferences, both for behavior and EEG decoding, rest on this group level. We do not think that variability at the individual-subject level detracts from this general approach.  

      In the Results, we added that our goal was to match performance across participants:

      “Importantly, mask contrast in the masked condition was adjusted using a staircasing procedure to match performance in the AB condition, ensuring comparable perceptual performance in the masked and the AB condition across participants (see Methods for more details).”

      In the Methods, we added:

      “Second, during the experimental session, after every 32 masked trials, mask contrast could be manually updated in accordance with our goal to match accuracy over participants, while also matching accuracy within participants as well as possible.”

      (2) In the analysis on collinearity and illusion-specific processing, the authors conclude that the absence of a significant effect of training set demonstrates collinearity-only processing. I don't think that this conclusion is warranted: as the illusory and nonillusory share the same shape, so more elaborate object processing could also be occurring. Please discuss. 

      We agree with this qualification of our interpretation, and included the reviewer’s account as an alternative explanation in the Discussion section:  

      “It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processes representing the triangular shapes as well.”

      (3) Discussion, lines 426-429: It is stated that the results align with the notion that processes of perceptual segmentation and organization represent the mechanism of conscious experience. My interpretation of the results is that they show the contrary: for the same visibility level in the attentional blind or masking conditions, these processes can be implicated or not, which suggests a role during unconscious processing instead. 

      We agree with the reviewer that the interpretation of this result depends on the definition of consciousness that one adheres to. If one takes report as the leading metric for consciousness (=conscious access), one can indeed conclude that perceptual segmentation/organization can also occur unconsciously. However, if the processing that results in the qualitative nature of an image (rather than whether it is reported) is taken as leading – such as the processing that results in the formation of an illusory percept – (=phenomenal) the conclusion can be quite different. This speaks to the still ongoing debate regarding the existence of phenomenal vs access consciousness, and the literature on no-report paradigms amongst others (see last paragraph of the discussion). Because the current data do not speak directly to this debate, we decided to remove  the sentence about “conscious experience”, and edited this part of the manuscript (also addressing a comment about preserved unconscious processing during masking by Reviewer 2) by limiting the interpretation of unconscious processing to those aspects that are uncontroversial:

      “Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling deep unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      (4) The two paradigms developed here could be used jointly to highlight nonidiosyncratic NCCs, i.e. EEG markers of visibility or confidence that generalise regardless of the method used. Have the authors attempted to train the classifier on one method and apply it to another (e.g. AB to masking and vice versa)? What perceptual level is assumed to transfer? 

      To avoid issues with post-hoc selection of (visible vs. invisible) trials (discussed in the Introduction), we did not divide our trials into conscious and unconscious trials, and thus did not attempt to reveal NCCs, or NCCs generalizing across the two paradigms. Note also that this approach alone would not resolve the debate regarding the ‘true’ NCC as it hinges on the operational definition of consciousness one adheres to; also see our response to the previous point the reviewer raised. Our main analysis revealed that the illusory triangle could be decoded with above-chance accuracy during both masking and the AB over extended periods of time with similar topographies (Fig. 2B), so that significant cross-decoding would be expected over roughly the same extended period of time (except for the heightened 200-250 ms peak). However, as our focus was on differences between the two manipulations and because we did not use post-hoc sorting of trials, we did not add these analyses.
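      For readers unfamiliar with the cross-decoding logic discussed here (training a classifier on one condition and testing it on another), a toy sketch using entirely synthetic data, not the authors' pipeline, illustrates the idea with a simple nearest-class-mean classifier:

```python
import random

random.seed(0)

def make_trials(n, offset):
    """Synthetic 2-channel 'EEG' trials, illusion-present vs illusion-absent,
    with a condition-specific baseline offset (toy stand-in for masking/AB)."""
    trials = []
    for _ in range(n):
        label = random.choice([0, 1])        # 0 = absent, 1 = present
        signal = 1.0 if label else -1.0
        x = [signal + offset + random.gauss(0, 0.5) for _ in range(2)]
        trials.append((x, label))
    return trials

def train_centroids(trials):
    """'Classifier' = the mean activity pattern of each class."""
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}
    counts = {0: 0, 1: 0}
    for x, y in trials:
        counts[y] += 1
        sums[y] = [s + v for s, v in zip(sums[y], x)]
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def accuracy(centroids, trials):
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    correct = sum(
        1 for x, y in trials
        if min(centroids, key=lambda c: dist(centroids[c], x)) == y
    )
    return correct / len(trials)

masking_trials = make_trials(200, offset=0.0)  # stand-in "masking" condition
ab_trials = make_trials(200, offset=0.2)       # stand-in "AB" condition

centroids = train_centroids(masking_trials)    # train on one condition...
print("within:", accuracy(centroids, masking_trials))
print("cross: ", accuracy(centroids, ab_trials))  # ...test on the other
```

If cross-condition accuracy stays high, the pattern distinguishing the two classes generalizes across conditions; the same logic underlies training on one manipulation (or stimulus set) and testing on another in the EEG analyses.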

      (5) How can the results be integrated with the attentional literature showing that attentional filters can be applied early in the processing hierarchy? 

      Compared to certain manipulations of spatial attention, the AB phenomenon is generally considered to represent an instance of “late” attentional filtering. In the Discussion section we included a paragraph on classic load theory, where early and late filtering depend on perceptual and attentional load. Just preceding this paragraph, we added this:  

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      Reviewer #2 (Public Review): 

      Summary: 

      This is a very elegant and important EEG study that unifies, within a single set of behaviorally equated experimental conditions, conscious access (and therefore also conscious access failures) during visual masking and attentional blink (AB) paradigms in humans. By a systematic and clever use of multivariate pattern classifiers across conditions, they could dissect, confirm, and extend a key distinction (initially framed within the GNWT framework) between 'subliminal' and 'pre-conscious' levels of unconscious processing. In particular, the authors could provide strong evidence to distinguish, within the same paradigm, these two levels of unconscious processing that precede conscious access: (i) an early (< 80 ms) bottom-up and local (in brain) stage of perceptual processing ('local contrast processing') that was preserved in both unconscious conditions, and (ii) a later, more integrated stage of processing (200-250 ms) that was impaired by masking but preserved during AB. On the basis of preexisting studies and theoretical arguments, they suggest that this later stage could correspond to lateral and local recurrent feedback processes. The late conscious access stage then appeared as a P3b-like event. 

      Strengths: 

      The methodology and analyses are strong and valid. This work adds an important piece in the current scientific debate about levels of unconscious processing and specificities of conscious access in relation to feed-forward, lateral, and late brain-scale top-down recurrent processing. 

      Weaknesses: 

      - The authors could improve clarity of the rich set of decoding analyses across conditions. 

      - They could also enrich their Introduction and Discussion sections by taking into account the importance of conscious influences on some unconscious cognitive processes (a revision of the traditional concept of 'automaticity'), which may introduce some complexity into the interpretation of the Results. 

      - They should discuss the rich literature reporting high-level unconscious processing in masking paradigms (culminating in semantic processing of digits, words or even small group of words, and pictures) in the light of their proposal (deeper unconscious processing during AB than during masking). 

      We thank the reviewer for their positive assessment of our study and for their insightful comments and helpful suggestions that helped to significantly strengthen our paper. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we followed the reviewer’s suggestions and revised the Results/Discussion to include references to influences on unconscious processes and expanded our discussion of unconscious effects during masking vs. AB.  

      Reviewer #3 (Public Review): 

      Summary: 

      This work aims to investigate how perceptual and attentional processes affect conscious access in humans. By using multivariate decoding analysis of electroencephalography (EEG) data, the authors explored the neural temporal dynamics of visual processing across different levels of complexity (local contrast, collinearity, and illusory perception). This is achieved by comparing the decodability of an illusory percept in matched conditions of perceptual (i.e., degrading the strength of sensory input using visual masking) and attentional impairment (i.e., impairing top-down attention using the attentional blink, AB). The decoding results reveal three distinct temporal responses associated with the three levels of visual processing. Interestingly, the early stage of local contrast processing remains unaffected by both masking and the AB. However, the later stages of collinearity and illusory percept processing are impaired by the perceptual manipulation but remain unaffected by the attentional manipulation. These findings contribute to the understanding of the unique neural dynamics of perceptual and attentional functions and how they interact with the different stages of conscious access. 

      Strengths: 

      The study investigates perceptual and attentional impairments across multiple levels of visual processing in a single experiment. Local contrast, collinearity, and illusory perception were manipulated using different configurations of the same visual stimuli. This clever design allows for the investigation of different levels of visual processing under similar low-level conditions. 

      Moreover, behavioural performance was matched between perceptual and attentional manipulations. One of the main problems when comparing perceptual and attentional manipulations on conscious access is that they tend to impact performance at different levels, with perceptual manipulations like masking producing larger effects. The study utilizes a staircasing procedure to find the optimal contrast of the mask stimuli to produce a performance impairment to the illusory perception comparable to the attentional condition, both in terms of perceptual performance (i.e., indicating whether the target contained the Kanizsa illusion) and metacognition (i.e., confidence in the response). 

      The results show a clear dissociation between the three levels of visual processing in terms of temporal dynamics. Local contrast was represented at an early stage (~80 ms), while collinearity and illusory perception were associated with later stages (~200-250 ms). Furthermore, the results provide clear evidence in support of a dissociation between the effects of perceptual and attentional processes on conscious access: while the former affected both neuronal correlates of collinearity and illusory perception, the latter did not have any effect on the processing of the more complex visual features involved in the illusion perception. 

      Weaknesses: 

      The design of the study and the results presented are very similar to those in Fahrenfort et al. (2017), reducing its novelty. Similar to the current study, Fahrenfort et al. (2017) tested the idea that if both masking and AB impact perceptual integration, they should affect the neural markers of perceptual integration in a similar way. They found that behavioural performance (hit/false alarm rate) was affected by both masking and AB, even though only the latter was significant in the unmasked condition. An early classification peak was instead only affected by masking. However, a late classification peak showed a pattern similar to the behavioural results, with classification affected by both masking and AB. 

      The interpretation of the results mainly centres on the theoretical framework of the recurrent processing theory of consciousness (Lamme, 2020), which leads to the assumption that local contrast, collinearity, and illusory perception reflect feedforward, local recurrent, and global recurrent connections, respectively. It should be mentioned, however, that this theoretical prediction is not directly tested in the study. Moreover, the evidence for the dissociation between illusion and collinearity in terms of lateral and feedback connections seems limited at best. For instance, Kok et al. (2016) found that, whereas bottom-up stimulation activated all cortical layers, feedback activity induced by illusory figures led to selective activation of the deep layers. Lee & Nguyen (2001), instead, found that V1 neurons respond to illusory contours of Kanizsa figures, particularly in the superficial layers. Both mention feedback connections, but neither seems to point to lateral connections. 

      Moreover, the evidence in favour of primarily lateral connections driving collinearity seems mixed as well. On one hand, Liang et al. (2017) showed that feedback and lateral connections closely interact to mediate image grouping and segmentation. On the other hand, Stettler et al. (2002) showed that, whereas the intrinsic connections link similarly oriented domains in V1, V2 to V1 feedback displays no such specificity. Furthermore, the other studies mentioned in the manuscript did not investigate feedback connections but only lateral ones, making it difficult to draw any clear conclusions. 

      We thank the reviewer for their careful review and positive assessment of our study, as well as for their constructive criticism and helpful suggestions. We provide a more detailed point-by-point response in the “recommendations for the authors” section below. In brief, we addressed the reviewer’s comments and suggestions by better relating our study to Fahrenfort et al.’s (2017) paper and by highlighting the limitations inherent in linking our findings to distinct neural mechanisms (in particular, to lateral vs. feedback connections).

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors): 

      -  Methods: it states that "The distance between the three Pac-Man stimuli as well as between the three aligned two-legged white circles was 2.8 degrees of visual angle". It is unclear what this distance refers to. Is it the shortest distance between the edges of the objects? 

      It is indeed the shortest distance between the edges of the objects. This is now included in the Methods.

      -  Methods: It's unclear to me if the mask updating procedure during the experimental session was based on detection rate or on the perceptual performance index reported on Fig1D. Please clarify. 

      It was based on accuracy calculated over 32 trials. We have included this information in the Methods.

      -  Methods and Results: I did not understand why the described procedure used to ensure that confidence ratings are not contaminated by differences in perceptual performance was necessary. To me, it just seems to make the "no manipulations" and "both manipulations" less comparable to the other 2 conditions. 

      To calculate accurate estimates of metacognitive sensitivity for the two matched conditions, we wanted participants to make use of the full confidence scale (asking them to distribute their responses evenly over all ratings within a block). By mixing all conditions in the same block, we would have run the risk of participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition). We made this point explicit in the Results section and in the Methods section:

      “To ensure that the distribution of confidence ratings in the performance-matched masked and AB condition was not influenced by participants anchoring their confidence ratings to the unmatched very easy and very difficult conditions (no and both manipulations condition, respectively), the masked and AB condition were presented in the same experimental block, while the other block type included the no and both manipulations condition.”

      “To ensure that confidence ratings for these matched conditions (masked, long lag and unmasked, short lag) were not influenced by participants anchoring their confidence ratings to the very easy and very difficult unmatched conditions (no and both manipulations, respectively), one type of block only contained the matched conditions, while the other block type contained the two remaining, unmatched conditions (masked, short lag and unmasked, long lag).”

      - Methods: what priors were used for Bayesian analyses? 

      Bayesian statistics were calculated in JASP (JASP Team, 2024) with default prior scales (Cauchy distribution, scale 0.707). This is now added to the Methods.

      - Results, line 162: It states that classifiers were applied on "raw EEG activity" but the Methods specify preprocessing steps. "Preprocessed EEG activity" seems more appropriate. 

      We changed the term to “preprocessed EEG activity” in the Methods and to “(minimally) preprocessed EEG activity (see Methods)” in the  Results, respectively.

      - Results, line 173: The effect of masking on local contrast decoding is reported as "marginal". If the alpha is set at 0.05, it seems that this effect is significant and should not be reported as marginal. 

      We changed the wording from “marginal” to “small but significant.”  

      - Fig1: The fixation cross is not displayed. 

      Because adding the fixation cross would have made the figure of the trial design look crowded and less clear, we decided to exclude it from this schematic trial representation. We are now stating this also in the legend of figure 1.  

      - Fig 3A: In the upper left panel, isn't there a missing significant effect of the "local contrast training and testing" condition in the first window? If not, this condition seems oddly underpowered compared to the other two conditions. 

      Thanks for the catch! The highlighting in bold and the significance bar were indeed lacking for this condition in the upper left panel (blue line). We corrected the figure in our revision.

      - Supplementary text and Fig S6: It is unclear to me why the two control analyses (the black lines vs. the green and purple lines) are pooled together in the same figure. They seem to test for different, non-comparable contrasts (they share neither training nor testing sets), and I find it confusing to find them on the same figure. 

      We agree that this may be confusing, and deleted the results from one control analysis from the figure (black line, i.e., training on contrast, testing on illusion), as the reviewer correctly pointed out that it displayed a non-comparable analysis. Given that this control analysis did not reveal any significant decoding, we now report its results only in the Supplementary text.  

      - Fig S6: I think the title of the legend should say testing on the non-illusory triangle instead of testing on the illusory triangle to match the supplementary text. 

      This was a typo – thank you! Corrected.  

      Reviewer #2 (Recommendations For The Authors): 

      Issue #1: One key asymmetry between the three levels of T2 attributes (i.e., local contrast; non-illusory triangle; illusory Kanizsa triangle) relates to the top-down conscious posture driven by the task, which focused exclusively on the last attribute (the illusory Kanizsa triangle). Therefore, any difference in EEG decoding performance across these three levels could also depend on this asymmetry. For instance, if participants were engaged to report local contrast or the non-illusory triangle, one could wonder whether decoding performance would differ from the one reported here. This potential confound was addressed by the authors by using decoders trained on different datasets in which the main task was to report one of the two other attributes. They could then test how classifiers trained on the task-related attribute behave on the main dataset. However, this part of the study is crucial but not 100% clear, and the links with the results of these control experiments are not fully explicit. Could the authors better clarify this important point (see also Issue #1 and #3)? 

      The reviewer raises an important point, alluding to potential differences between decoded features regarding task relevance. There are two separate sets of analyses where task relevance may have been a factor, our main analyses comparing illusion to contrast decoding, and our comparison of collinearity vs. illusion-specific processing.  

      In our main analysis, we are indeed reporting decoding of a task-relevant feature (illusion) and of a task-irrelevant feature (local contrast, i.e., rotation of the Pac-Man inducers). Note, however, that the Pac-Man inducers were always task-relevant, as they needed to be processed to perceive illusory triangles, so that local contrast decoding was based on task-relevant stimulus elements, even though participants did not respond to local contrast differences in the main experiment. However, we also ran control analyses testing the effect of task-relevance on local contrast decoding in our independent training data set and in another (independent) study, where local contrast was, in separate experimental blocks, task-relevant or task-irrelevant. The results are reported in the Supplementary Text and in Figure S5. In brief, task-relevance did not improve early (70–95 ms) decoding of local contrast. We are thus confident that the comparison of local contrast to illusion decoding in our main analysis was not substantially affected by differences in task relevance. In our previous manuscript version, we referred to these control analyses only in the collinearity-vs-illusion section of the Results. In our revision, we added the following in the Results section comparing illusion to contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      In addition to our main analysis, there is the concern that our comparison of collinearity vs. illusion-specific processing may have been affected by differences in task-relevance between the stimuli inducing the non-illusory triangle (the “two-legged white circles”, collinearity-only) and the stimuli inducing the Kanizsa illusion (the Pac-Man inducers, collinearity-plus-illusion). We would like to emphasize that in our main analysis classifiers were always used to decode T2 illusion presence vs. absence (collinearity-plus-illusion), and never to decode T2 collinearity-only. To distinguish collinearity-only from collinearity-plus-illusion processing, we only varied the training data (training classifiers on collinearity-only or collinearity-plus-illusion), using the independent training data set, where collinearity-only and collinearity-plus-illusion (and rotation) were task-relevant (in separate blocks). As discussed in the Supplementary Information, for this analysis approach to be valid, collinearity-only processing should be similar for the illusory and the non-illusory triangle, and this is what control analyses demonstrated (Fig. S7). In any case, general task-relevance was equated for the collinearity-only and the collinearity-plus-illusion classifiers.  

      Finally, in supplementary Figure 6 we also show that our main results reported in Figure 2 (discussed at the top of this response) were very similar when the classifiers were trained on the independent localizer dataset in which each stimulus feature could be task-relevant.  

      Together, for the reasons described above, we believe that differences in EEG decoding performance across these three stimulus levels are unlikely to depend on a “task-relevance” asymmetry.

      Issue #2: Following on my previous point, the authors should better introduce the concept of conscious influences on unconscious processing, which led to a full revision of the notion of automaticity in cognitive science [1, 2, 3, 4]. For instance, the discovery that conscious endogenous temporal and spatial attention modulate unconscious subliminal processing paved the way to this revision. This concept raises the importance of Issue #1: equating performance on the main task across AB and masking is not enough to guarantee that differences in neural processing of the unattended attributes of T2 (i.e., task-unrelated attributes) are not, in part, due to this asymmetry rather than to a systematic difference in unconscious processing strength [5, 6-8]. Obviously, the reported differences in real-triangle decoding between AB and masking cannot be totally explained by such a factor (because this is a task-unrelated attribute for both AB and masking conditions), but this issue should still be better introduced, addressed, clarified (Issues #1 and #3), and discussed. 

We would like to refer to our response to the previous point: Control analyses for local contrast decoding showed that task relevance had no influence on our marker for feedforward processing. Most importantly, as outlined above, we did not perform real-triangle decoding – all our decoding analyses comparing collinearity-only vs. collinearity-plus-illusion were run on the task-relevant T2 illusion (decoding its presence vs. absence). The key difference was solely the training set, where the collinearity-only classifier was trained on the (task-relevant) real triangle and the collinearity-plus-illusion classifier was trained on the (task-relevant) Kanizsa triangle. Thus, overall task relevance was controlled in these analyses.

      In our revision, we are now also citing the studies proposed by the reviewer, when discussing the control analyses testing for an effect of task-relevance on local contrast decoding:

      “In the light of evidence showing that unconscious processing is susceptible to conscious top-down influences (Kentridge et al., 2004; Kiefer & Brendel, 2006; Naccache et al., 2002), we ran control analyses showing that early local contrast decoding was not improved by rendering contrast task-relevant (see Supplementary Information and Fig. S5), indicating that these differences between illusion and contrast decoding did not reflect differences in task-relevance.”

      Issue #3: In terms of clarity, I would suggest the authors to add a synthetic figure providing an overall view of all pairs of intra and cross-conditions decoding analyses and mentioning main task for training and testing sets for each analysis (see my previous and related points). Indeed, at one point, the reader can get lost and this would not only strengthen accessibility to the detailed picture of results, but also pinpoint the limits of the work (see previous point). 

      We understand the point the reviewer is raising and acknowledge that some of our analyses, in particular those using different training and testing sets, may be difficult to grasp. But given the variety of different analyses using different training and testing sets, different temporal windows, as well as different stimulus features, it was not possible to design an intuitive synthetic figure summarizing the key results. We hope that the added text in the Results and Discussion section will be sufficient to guide the reader through our set of analyses.  

      In our revision, we are now more clearly highlighting that, in addition to presenting the key results in our main text that were based on training classifiers on the T1 data, “we replicated all key findings when training the classifiers on an independent training set where individual stimuli were presented in isolation (Fig. 3A, results in the Supplementary Information and Fig. S6).” For this, we added a schematic showing the procedure of the independent training set to Figure 3, more clearly pointing the reader to the use of a separate training data set.  

      Issue #4: In the light of these findings the authors should discuss more thoroughly the question of unconscious high-level representations in masking versus AB: in particular, a longstanding issue relates to unconscious semantic processing of words, numbers or pictures. According to their findings, they tend to suggest that semantic processing should be more enabled in AB than in masking. However, a rich literature provided a substantial number of results (including results from the last authors Simon Van Gaal) that tend to support the notion of unconscious semantic processing in subliminal processing (see in particular: [9 , 10 , 11 , 12 , 13]). So, and as mentioned by the authors, while there is evidence for semantic processing during AB they should better discuss how they would explain unconscious semantic subliminal processing. While a possibility could be to question the unconscious attribute of several subliminal results, the same argument also holds for AB studies. Another possible track of discussion would be to differentiate AB and subliminal perception in terms of strength and durability of the corresponding unconscious representations, but not necessarily in terms of cognitive richness. Indeed, one may discuss that semantic processing of stimuli that do not need complex spatial integration (e.g.: words or digits as compared to illusory Kanisza tested here) can still be observed under subliminal conditions. 

      We thank the reviewer for pointing us to this shortcoming of our previous Discussion. Note that our data does not directly speak to the question of high-level unconscious representations in masking vs AB, because such conclusions would hinge on the operational definition of consciousness one adheres to (also see response to Reviewer 1). Nevertheless, we do follow the reviewer’s suggestions and added the following in the Discussion (also addressing a point about other forms of attention raised by Reviewer 1):

      “Clearly, these findings do not imply that unconscious high-level (e.g., semantic) processing can only occur during inattention, nor do they necessarily generalize to other forms of inattention. Indeed, while the AB represents a prime example of late attentional filtering, other ways of inducing inattention or distraction (e.g., by manipulating spatial attention) may filter information earlier in the processing hierarchy (e.g., Luck & Hillyard, 1994 vs. Vogel et al., 1998).”

      And, in a following paragraph in the Discussion:

“Such deep feedforward processing can be sufficient for unconscious high-level processing, as indicated by a rich literature demonstrating high-level (e.g., semantic) processing during masking (Kouider & Dehaene, 2007; Van den Bussche et al., 2009; van Gaal & Lamme, 2012). Thus, rather than enabling high-level unconscious processing, preserved local recurrency during inattention may afford other processing advantages linked to its proposed role in perceptual integration (Lamme, 2020), such as integration of stimulus elements over space or time.”

      Reviewer #3 (Recommendations For The Authors): 

(1) The objective of Fahrenfort et al., 2017 seems very similar to that of the current study. What are the main differences between the two studies? Moreover, Fahrenfort et al., 2017 conducted similar decoding analyses to those performed in the current study. Which results were replicated in the current study, and which ones are novel? Highlighting these differences in the manuscript would be beneficial.

      We now provide a more comprehensive coverage of the study by Fahrenfort et al., 2017. In the Introduction, we added a brief summary of the key findings, highlighting that this study’s findings could have reflected differences in task performance rather than differences between masking and AB:

“For example, Fahrenfort and colleagues (2017) found that illusory surfaces could be decoded from electroencephalogram (EEG) data during the AB but not during masking. This was taken as evidence that local recurrent interactions, supporting perceptual integration, were preserved during inattention but fully abolished by masking. However, masking had a much stronger behavioral effect than the AB, effectively reducing task performance to chance level. Indeed, a control experiment using weaker masking, which resulted in behavioral performance well above chance, similar to the main experiment’s AB condition, revealed some evidence for preserved local recurrent interactions also during masking. However, these conditions were tested in separate experiments with small samples, precluding a direct comparison of perceptual vs. attentional blindness at matched levels of behavioral performance. To test …”

In the Results, we are now also highlighting this key advancement by directly referencing the previous study:

“Thus, whereas in previous studies task performance was considerably higher during the AB than during masking (e.g., Fahrenfort et al., 2017), in the present study the masked and the AB condition were matched in both measures of conscious access.” When reporting the EEG decoding results in the Results section, we consistently cite the Fahrenfort et al. (2017) study to highlight similarities between the two studies’ findings. We also added a few sentences explicitly relating the key findings of the two studies:

      “This suggests that the AB allowed for greater local recurrent processing than masking, replicating the key finding by Fahrenfort and colleagues (2017). Importantly, the present result demonstrates that this effect reflects the difference between the perceptual vs. attentional manipulation rather than differences in behavior, as the masked and the AB condition were matched for perceptual performance and metacognition.”

“This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017), who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      We also more clearly highlighted where our study goes beyond Fahrenfort et al.’s (2017), e.g., in the Results:

      “The addition of this element of collinearity to our stimuli was a key difference to the study by Fahrenfort and colleagues (2017), allowing us to compare non-illusory triangle decoding to illusory triangle decoding in order to distinguish between collinearity and illusion-specific processing.”

      And in the Discussion:

      “Furthermore, the addition of line segments forming a non-illusory triangle to the stimulus employed in the present study allowed us to distinguish between collinearity and illusion-specific processing.”

      Also, in the Discussion, we added a paragraph “summarizing which results were replicated in the current study, and which ones are novel”, as suggested by the reviewer:

      “This pattern of results is consistent with a previous study that used EEG to decode Kanizsa-like illusory surfaces during masking and the AB (Fahrenfort et al., 2017). However, the present study also revealed some effects where Fahrenfort and colleagues (2017) failed to obtain statistical significance, likely reflecting the present study’s considerably larger sample size and greater statistical power. For example, in the present study the marker for feedforward processing was weakly but significantly impaired by masking, and the marker for local recurrency was significantly impaired not only by masking but also by the AB, although to a lesser extent. Most importantly, however, we replicated the key findings that local recurrent processing was more strongly impaired by masking than by the AB, and that global recurrent processing was similarly impaired by masking and the AB and closely linked to task performance, reflecting conscious access. Crucially, having matched the key conditions behaviorally, the present finding of greater local recurrency during the AB can now unequivocally be attributed to the attentional vs. perceptual manipulation of consciousness.”

Finally, we changed the title to “Distinct neural mechanisms underlying perceptual and attentional impairments of conscious access despite equal task performance” to highlight one of the crucial differences between the Fahrenfort et al. (2017) study and the present study, namely the fact that we equalized task performance between the two critical conditions (AB and masking).

      (2) It is not clear from the text the link between the current study and the literature on the role of lateral and feedback connections in consciousness (Lamme, 2020). A better explanation is needed. 

To our knowledge, consciousness theories such as Lamme’s recurrent processing theory currently make no distinction between the roles of lateral and feedback connections for consciousness. The principled distinction lies between unconscious feedforward processing and phenomenally conscious or “preconscious” local recurrent processing, where local recurrency refers to both lateral (or horizontal) and feedback connections. We added a sentence in the Discussion:

      “As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness …”

      (3) When training on T1 and testing on T2, EEG data showed an early peak in local contrast classification at 75-95 ms over posterior electrodes. The authors stated that this modulation was only marginally affected by masking (and not at all by AB); however, the main effect of masking is significant. Why was this effect interpreted as nonrelevant? 

      Following this and Reviewer 1’s comment, we changed the wording from “marginal” to “weak but significant.” We considered this effect “weak” and of lesser relevance, because its Bayes factor indicated that the alternative hypothesis was only 1.31 times more likely than the null hypothesis of no effect, representing only “anecdotal” evidence, which is in sharp contrast to the robust effects of the consciousness manipulations on illusion decoding reported later. Furthermore, later ANOVAs comparing the effect of masking on contrast vs. illusion decoding revealed much stronger effects on illusion decoding than on contrast decoding (BFs>3.59×10<sup>4</sup>).
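The “anecdotal” label above follows the conventional Jeffreys-style thresholds for interpreting Bayes factors. A minimal sketch of that mapping, where the thresholds reflect the common convention rather than anything taken from the manuscript:

```python
# Conventional (Jeffreys-style) evidence labels for a Bayes factor BF10,
# i.e., the likelihood of the alternative relative to the null hypothesis.
# Thresholds follow the widely used convention; this is illustrative only.
def evidence_label(bf10: float) -> str:
    """Map a Bayes factor BF10 to a conventional evidence category."""
    if bf10 < 1:
        return "favors null"      # BF01 = 1/BF10 > 1
    if bf10 < 3:
        return "anecdotal"
    if bf10 < 10:
        return "moderate"
    if bf10 < 30:
        return "strong"
    if bf10 < 100:
        return "very strong"
    return "extreme"
```

Under this convention, a BF of 1.31 (the alternative only 1.31 times more likely than the null) falls in the “anecdotal” band, while BFs above 3.59×10<sup>4</sup> fall in the “extreme” band, which is the contrast the response draws.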

      (4) The decoding analysis on the illusory percept yielded two separate peaks of decoding, one from 200 to 250 ms and another from 275 to 475 ms. The early component was localized occipitally and interpreted as local sensory processing, while the late peak was described as a marker for global recurrent processing. This latter peak was localized in the parietal cortex and associated with the P300. Can the authors show the topography of the P300 evoked response obtained from the current study as a comparison? Moreover, source reconstruction analysis would probably provide a better understanding of the cortical localization of the two peaks. 

      Figure S4 now shows the P300 from electrode Pz, demonstrating a stronger positivity between 375 and 475 ms when the illusory triangle was present than when it was absent. We did not run a source reconstruction analysis.  

      (5) The authors mention that the behavioural results closely resembled the pattern of the second decoding peak results. However, they did not show any evidence for this relationship. For instance, is there a correlation between the two measures across or within participants? Does this relationship differ between the illusion report and the confidence rating? 

This relationship became evident from simply eyeballing the results figures: both in behavior and in EEG decoding, performance dropped from the no-manipulations condition to the AB and masked conditions, while these two conditions did not differ significantly. Following a similar observation of a close similarity between behavior and the second/late illusion decoding peak in the study by Fahrenfort et al. (2017), we adopted their analysis approach and ran two additional ANOVAs, adding “measure” (behavior vs. EEG) as a factor. For this analysis, we dropped the both-manipulations condition due to scale restrictions (as noted in footnote 1: “We excluded the both-manipulations condition from this analysis due to scale restrictions: in this condition, EEG decoding at the second peak was at chance, while behavioral performance was above chance, leaving more room for behavior to drop from the masked and AB condition.”). The analysis revealed that there were no interactions with condition:

“The pattern of behavioral results, both for perceptual performance and metacognitive sensitivity, closely resembled the second decoding peak: sensitivity in all three metrics dropped from the no-manipulations condition to the masked and AB conditions, while sensitivity did not differ significantly between these performance-matched conditions (Fig. 2C). Two additional rm ANOVAs with the factors measure (behavior, second EEG decoding peak) and condition (no-manipulations, masked, AB)<sup>1</sup> for perceptual performance and metacognitive sensitivity revealed no significant interaction (performance: F<sub>2,58</sub>=0.27, P=0.762, BF<sub>01</sub>=8.47; metacognition: F<sub>2,58</sub>=0.54, P=0.586, BF<sub>01</sub>=6.04). This similarity between behavior and EEG decoding replicates the findings of Fahrenfort and colleagues (2017), who also found a striking similarity between late Kanizsa decoding (at 406 ms) and behavioral Kanizsa detection. These results indicate that global recurrent processing at these later points in time reflected conscious access to the Kanizsa illusion.”

      (6) The marker for illusion-specific processing emerged later (200-250 ms), with the nomanipulation decoding performing better after training on the illusion than the nonillusory triangle. This difference emerged only in the AB condition, and it was fully abolished by masking. The authors confirmed that the illusion-specific processing was not affected by the AB manipulations by running a rm ANOVA which did not result in a significant interaction between condition and training set. However, unlike the other non-significant results, a Bayes Factor is missing here. 

      We added Bayes factors to all (significant and non-significant) rm ANOVAs.

      (7) The same analysis yielded a second illusion decoding peak at 375-475 ms. This effect was impaired by both masking and AB, with no significant differences between the two conditions. The authors stated that this result was directly linked to behavioural performance. However, it is not clear to me what they mean (see point 5). 

      We added analyses comparing behavior and EEG decoding directly (see our response to point 5).

      (8) The introduction starts by stating that perceptual and attentional processes differently affect consciousness access. This differentiation has been studied thoroughly in the consciousness literature, with a focus on how attention differs from consciousness (e.g., Koch & Tsuchiya, TiCS, 2007; Pitts, Lutsyshyna & Hillyard, Phil. Trans. Roy. Soc. B Biol. Sci., 2018). The authors stated that "these findings confirm and enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness clearly distinguishing and specifying the neural profiles of each processing stage of the influential four-stage model of conscious experience". I found it surprising that this aspect was not discussed further. What was the state of the art before this study was conducted? What are the mentioned neural profiles? How did the current results enrich the literature on this topic? 

We would like to point out that our study is not primarily concerned with the conceptual distinction between consciousness and attention, which has been the central focus of, e.g., Koch and Tsuchiya (2007). While this literature was concerned with ways to dissociate consciousness and attention, we tacitly assumed that attention and consciousness are now generally considered different constructs. Our study is thus not dealing with dissociations between attention and consciousness, nor with the distinction between phenomenal consciousness and conscious access, but is concerned with different ways of impairing conscious access (defined as the ability to report about a stimulus), either via perceptual or via attentional manipulations. Regarding the state of the art before the study was conducted, we would like to refer to the motivation of our study in the Introduction, e.g., previous studies’ difficulties in unequivocally linking greater local recurrency during attentional than perceptual blindness to the consciousness manipulation, given performance confounds (we expanded this Introduction section). We also expanded a paragraph in the Discussion to remind the reader of the neural profiles of the 4-stage model and to highlight the novelty of our findings related to the distinction between lateral and feedback processes:

“As current theories do not distinguish between the roles of lateral vs. feedback connections for consciousness, the present findings may enrich empirical and theoretical work on perceptual vs. attentional mechanisms of consciousness (Block, 2005; Dehaene et al., 2006; Hatamimajoumerd et al., 2022; Lamme, 2010; Pitts et al., 2018; Sergent & Dehaene, 2004), clearly distinguishing the neural profiles of each processing stage of the influential four-stage model of conscious experience (Fig. 1A). Along with the distinct temporal and spatial EEG decoding patterns associated with lateral and feedback processing, our findings suggest a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report.”

      (9) When stating that this is the first study in which behavioural measures of conscious perception were matched between the attentional blink and masking, it would be beneficial to highlight the main differences between the current study and the one from Fahrenfort et al., 2017, with which the current study shares many similarities in the experimental design (see point 1). 

      We would like to refer the reviewer to our response to point 1), where we detail how we expanded the discussion of similarities and differences between our present study and Fahrenfort et al. (2017).

      (10) The discussion emphasizes how the current study "suggests a processing sequence from feedforward processing to local recurrent interactions encompassing lateral-to-feedback connections, ultimately leading to global recurrency and conscious report". For transparency, it is though important to highlight that one limit of the current study is that it does not provide direct evidence for the specified types of connections (see point 6). 

      We added a qualification in the Discussion section:

      “Although the present EEG decoding measures cannot provide direct evidence for feedback vs. lateral processes, based on neurophysiological evidence, …”

      Furthermore, we added this qualification in the Discussion section:

“It should be noted that not all neurophysiological evidence unequivocally links processing of collinearity and of the Kanizsa illusion to lateral and feedback processing, respectively (Angelucci et al., 2002; Bair et al., 2003; Chen et al., 2014), so that overlap in decoding the illusory and non-illusory triangle may reflect other mechanisms, for example feedback processing as well.”

      References

      Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J.-M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 22(19), 8633–8646.

      Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 23(20), 7690–7701.

      Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9(2), 46–52.

      Chen, M., Yan, Y., Gong, X., Gilbert, C. D., Liang, H., & Li, W. (2014). Incremental integration of global contours through interplay between visual cortical areas. Neuron, 82(3), 682–694.

      Dehaene, S., Changeux, J.-P., Naccache, L., Sackur, J., & Sergent, C. (2006). Conscious, preconscious, and subliminal processing: a testable taxonomy. Trends in Cognitive Sciences, 10(5), 204–211.

      Hatamimajoumerd, E., Ratan Murty, N. A., Pitts, M., & Cohen, M. A. (2022). Decoding perceptual awareness across the brain with a no-report fMRI masking paradigm. Current Biology: CB. https://doi.org/10.1016/j.cub.2022.07.068

JASP Team. (2024). JASP (Version 0.19.0) [Computer software]. https://jasp-stats.org/

Kentridge, R. W., Heywood, C. A., & Weiskrantz, L. (2004). Spatial attention speeds discrimination without awareness in blindsight. Neuropsychologia, 42(6), 831–835.

      Kiefer, M., & Brendel, D. (2006). Attentional Modulation of Unconscious “Automatic” Processes: Evidence from Event-related Potentials in a Masked Priming Paradigm. Journal of Cognitive Neuroscience, 18(2), 184–198.

      Kouider, S., & Dehaene, S. (2007). Levels of processing during non-conscious perception: a critical review of visual masking. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 857–875.

      Lamme, V. A. F. (2010). How neuroscience will change our view on consciousness. Cognitive Neuroscience, 1(3), 204–220.

      Luck, S. J., & Hillyard, S. A. (1994). Electrophysiological correlates of feature analysis during visual search. Psychophysiology, 31(3), 291–308.

      Naccache, L., Blandin, E., & Dehaene, S. (2002). Unconscious masked priming depends on temporal attention. Psychological Science, 13(5), 416–424.

Pitts, M. A., Lutsyshyna, L. A., & Hillyard, S. A. (2018). The relationship between attention and consciousness: an expanded taxonomy and implications for ‘no-report’ paradigms. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 373(1755), 20170348.

      Sergent, C., & Dehaene, S. (2004). Is consciousness a gradual phenomenon? Evidence for an all-or-none bifurcation during the attentional blink. Psychological Science, 15(11), 720–728.

Van den Bussche, E., Van den Noortgate, W., & Reynvoet, B. (2009). Mechanisms of masked priming: a meta-analysis. Psychological Bulletin, 135(3), 452–477.

van Gaal, S., & Lamme, V. A. F. (2012). Unconscious high-level information processing: Implication for neurobiological theories of consciousness. The Neuroscientist: A Review Journal Bringing Neurobiology, Neurology and Psychiatry, 18(3), 287–301.

Vogel, E. K., Luck, S. J., & Shapiro, K. L. (1998). Electrophysiological evidence for a postperceptual locus of suppression during the attentional blink. Journal of Experimental Psychology. Human Perception and Performance, 24(6), 1656–1674.

    1. Reviewer #2 (Public review):

      Summary:

      This work by den Bakker and Kloosterman contributes to the vast body of research exploring the dynamics governing the communication between the hippocampus (HPC) and the medial prefrontal cortex (mPFC) during spatial learning and navigation. Previous research showed that population activity of mPFC neurons is replayed during HPC sharp-wave ripple events (SWRs), which may therefore correspond to privileged windows for the transfer of learned navigation information from the HPC, where initial learning occurs, to the mPFC, which is thought to store this information long term. Indeed, it was also previously shown that the activity of mPFC neurons contains task-related information that can inform about the location of an animal in a maze, which can predict the animals' navigational choices. Here, the authors aim to show that the mPFC neurons that are modulated by HPC activity (SWRs and theta rhythms) are distinct from those "encoding" spatial information. This result could suggest that the integration of spatial information originating from the HPC within the mPFC may require the cooperation of separate sets of neurons.

      This observation may be useful to further extend our understanding of the dynamics regulating the exchange of information between the HPC and mPFC during learning. However, my understanding is that this finding is mainly based upon a negative result, which cannot be statistically proven by the failure to reject the null hypothesis. Moreover, in my reading, the rest of the paper mainly replicates phenomena that have already been described, with the original reports not correctly cited. My opinion is that the novel elements should be precisely identified and discussed, while the current phrasing in the manuscript, in most cases, leads readers to think that these results are new. Detailed comments are provided below.

      Major concerns:

      (1) The main claim of the manuscript is that the neurons involved in predicting upcoming choices are not the neurons modulated by the HPC. This is based upon the evidence provided in Figure 5, which is a negative result that the authors employ to claim that predictive non-local representations in the mPFC are not linked to hippocampal SWRs and theta phase. However, it is important to remember that in a statistical test, the failure to reject the null hypothesis does not prove that the null hypothesis is true. Since this claim is so central in this work, the authors should use appropriate statistics to demonstrate that the null hypothesis is true. This can be accomplished by showing that there is no effect above some size that is so small that it would make the effect meaningless (see https://doi.org/10.1177/070674370304801108).
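The equivalence-testing approach the reviewer suggests can be illustrated with a two one-sided tests (TOST) procedure: equivalence is claimed only if the observed difference is significantly above a lower bound -delta AND significantly below an upper bound +delta. The sketch below uses simulated data and an arbitrary equivalence bound; it is not the authors' analysis.

```python
# Minimal TOST (two one-sided tests) equivalence sketch. Rejecting both
# one-sided tests against bounds (-delta, +delta) supports the claim that
# any effect is smaller than the smallest meaningful size.
# Data and the equivalence bound are illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
modulated = rng.normal(0.0, 1.0, 40)    # e.g., a choice-prediction metric, SWR-modulated cells
unmodulated = rng.normal(0.0, 1.0, 40)  # same metric, SWR-unmodulated cells
delta = 0.5                             # smallest difference considered meaningful (raw units)

diff = modulated.mean() - unmodulated.mean()
se = np.sqrt(modulated.var(ddof=1) / len(modulated)
             + unmodulated.var(ddof=1) / len(unmodulated))
df = len(modulated) + len(unmodulated) - 2  # pooled df used for simplicity

# Two one-sided tests: H0a: diff <= -delta ; H0b: diff >= +delta
t_lower = (diff + delta) / se
t_upper = (diff - delta) / se
p_lower = 1 - stats.t.cdf(t_lower, df)
p_upper = stats.t.cdf(t_upper, df)

# Equivalence is claimed only if BOTH one-sided tests reject
p_tost = max(p_lower, p_upper)
equivalent = p_tost < 0.05
```

Unlike a conventional non-significant t-test, a significant TOST result positively supports the absence of an effect larger than delta, which is the kind of evidence the reviewer is asking for.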

(2) The main claim of the work is also based on Figure 3, where the authors show that SWRs-unmodulated mPFC neurons have higher spatial tuning, higher directional selectivity scores, and a higher percentage of these neurons show theta skipping. This is used to support the claim that SWRs-unmodulated cells encode spatial information. However, it must be noted that in this kind of task, it is not possible to disentangle the processing of spatial information from specific task variables involving separate cognitive processes, such as decision-making, attention, motor control, etc., which always happen at specific locations of the maze. Therefore, the results shown in Figure 3 may relate to other specific processes rather than encoding of space, and it cannot be unequivocally claimed that mPFC neurons "encode spatial information". This limitation is presented by Mashoori et al. (2018), an article that appears to be a major inspiration for this work. Can the authors provide a control analysis/experiment that supports their claim? Otherwise, this claim should be tempered. Also, the authors say that Jadhav et al. (2016) showed that mPFC neurons unmodulated by SWRs are less tuned to space. How do they reconcile it with their results?

(3) My reading is that the rest of the paper mainly consists of replications or incremental observations of already known phenomena with some not necessarily surprising new observations:<br /> a) Figure 2 shows that a subset of mPFC neurons is modulated by HPC SWRs and theta (already known), that vmPFC neurons are more strongly modulated by SWRs (not surprising given anatomy), and that theta phase preference is different between vmPFC and dmPFC (not surprising given the fact that theta is a travelling wave).<br /> b) Figure 4 shows that non-local representations in mPFC are predictive of the animal's choice. This is mostly an increment to the work of Mashoori et al. (2018). My understanding is that in addition to what had already been shown by Mashoori et al., here it is shown how the upcoming choice can be predicted. The authors may want to emphasize this novel aspect.<br /> c) Figure 6 shows that prospective activity in the HPC is linked to SWRs and theta oscillations. This has been described in various forms since at least the works of Johnson and Redish in 2007, Pastalkova et al. 2008, and Dragoi and Tonegawa (2011 and 2013), as well as in earlier literature on splitter cells. These foundational papers on this topic are not even cited in the current manuscript.<br /> Although some previous work is cited, the current narrative of the results section may lead the reader to think that these results are new, which I think is unfair. Previous evidence of the same phenomena should be cited all along the results and what is new and/or different from previous results should be clearly stated and discussed. Pure replications of previous works may actually just be supplementary figures.
It is not fair that the titles of paragraphs and main figures correspond to notions that are well established in the literature (e.g., Figure 2, 2nd paragraph of results, etc.).<br /> d) My opinion is that, overall, the paper gives the impression of being somewhat rushed and lacking attention to detail. Many figure panels are difficult to understand due to incomplete legends and visualizations with tiny, indistinguishable details. Moreover, some previous works are not correctly cited. I tried to make a list of everything I spotted below.

    1. Reviewer #2 (Public review):

      Summary:

      Dr. Adam Kim and collaborators study the changes in chromatin structure in monocytes obtained from alcohol-associated hepatitis (AH) patients when compared to healthy controls (HC). Through the usage of high-throughput chromatin conformation capture technology (Hi-C), they collected data on contact frequencies between both contiguous and distal DNA windows (100 kb each), mainly within the same chromosome. From the analyses of those data in the two cohorts, the authors describe frequent pairs of regions subject to significant changes in contact frequency across cohorts. Their accumulation in specific regions of the genome - referred to as hotspots - motivated the authors to narrow down their analyses to these disease-associated regions, in many of which, the authors claim, a number of key innate immune genes can be found. Ultimately, the authors try to draw a link between the changes observed in chromatin architecture in some of these hotspots and the differential co-expression of the genes lying within those regions, as ascertained in previous single-cell transcriptomic analyses.

      Strengths:

      The main strength of this paper lies in the generation of Hi-C data from patients, a valuable asset that, as the authors emphasize, offers critical insights into the role of chromatin architecture dysregulation in the pathogenesis of alcohol-associated hepatitis (AH). If confirmed, the reported findings have the potential to highlight an important, yet overlooked, aspect of cellular dysregulation - chromatin conformation changes - not only in AH but potentially in other immune-related conditions with a component of pathological inflammation.

      Weaknesses:

      The two weaknesses of the work that I regard as most important are, I feel, more methodological than conceptual. The first of these issues concerns the perhaps insufficient level of description provided on the definition of some key types of genomic regions, such as topologically associated domains, DNA hotspots, or even DNA loci showing significant changes in contact frequency between AH and HC. In spite of the importance of these concepts in the paper, no operational, explicit description of how they are defined, from a statistical point of view, is provided in the current version of the manuscript.

      Without these definitions, some of the claims that the authors make in their work become hard to sustain. Some examples are the claim that randomizing samples does not lead to significant differences between cohorts; the claim that most of the changes in contact frequency happen locally; or the claim that most changes do not alter the structure of TADs, but appear either within or between TADs. In my view, specific descriptions and the implementation of proper tests to check these hypotheses and back up the mentioned claims, along with the inclusion of explicit results on these matters, would contribute very significantly to strengthening the overall message of the paper.

      The second notable weakness of the study pertains to the characterization of the changes observed around immune genes in relation to genome-wide expectations. Although the authors suggest that certain hotspots contain a high number of immune-related genes, no enrichment analysis is provided to verify whether these regions indeed harbor a higher concentration of such genes compared to other genomic areas. It would be important for readers to be promptly informed if no such enrichment is observed, for in that case, the presence of some immune genes within these hotspots would carry more limited implications.

      Additionally, the criteria used to define a hotspot are not clearly outlined, making it difficult to assess whether the changes in contact frequencies around the immune genes highlighted in figures 5-8 are truly more pronounced than what would be expected genome-wide.

  5. pkwaredownloads.blob.core.windows.net
    1. If one of the fields in the end of central directory record is too small to hold required data, the field should be set to -1 (0xFFFF or 0xFFFFFFFF)

      This is a weird way to put it, since the very first note given in this section is that all fields unless otherwise noted are unsigned...
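Read literally, the "-1" is just the all-bits-set value of an unsigned field, which doubles as the Zip64 sentinel. A minimal sketch of how a reader might detect it (illustrative code, not taken from the specification; `needs_zip64` and the variable names are our own, while the 22-byte field layout follows the standard end of central directory record):

```python
import struct

# End of central directory (EOCD) record layout: signature, disk number,
# central-directory disk, entries on this disk, total entries,
# central-directory size, central-directory offset, comment length.
EOCD_FMT = "<4sHHHHIIH"  # 22 bytes, all unsigned little-endian fields

def needs_zip64(eocd: bytes) -> bool:
    """Return True if any EOCD field holds the Zip64 sentinel.

    The spec's "-1" is simply the unsigned maximum: 0xFFFF for 16-bit
    fields and 0xFFFFFFFF for 32-bit fields. When present, the real
    value must be read from the Zip64 end of central directory record.
    """
    sig, disk, cd_disk, n_disk, n_total, cd_size, cd_off, _clen = struct.unpack(
        EOCD_FMT, eocd[:22]
    )
    if sig != b"PK\x05\x06":
        raise ValueError("not an EOCD record")
    return (
        0xFFFF in (disk, cd_disk, n_disk, n_total)
        or 0xFFFFFFFF in (cd_size, cd_off)
    )
```

In other words, a writer that stores 0xFFFFFFFF in the central-directory offset is not storing a signed -1 at all; it is storing the unsigned maximum, which the spec reuses as a "look in the Zip64 record instead" marker.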

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable study combined whole-head magnetoencephalography (MEG) and subthalamic (STN) local field potential (LFP) recordings in patients with Parkinson's disease undergoing deep brain stimulation surgery. The paper provides solid evidence that cortical and STN beta oscillations are sensitive to movement context and may play a role in the coordination of movement redirection.

      We are grateful for the expert assessment by the editor and the reviewers. Below we provide point-by-point replies to both public and private reviews. We have tried to keep the answers in the public section short and concise, not citing the changed passages unless the point does not re-appear in the recommendations. There, we did include all of the changes to the manuscript, such that the reviewers need not go back and forth between replies and manuscript.

      The reviewer comments have not only led to numerous improvements of the text, but also to new analyses, such as Granger causality analysis, and to methodological improvements, e.g. the inclusion of numerous covariates in the statistical analyses. We believe that the article improved substantially through this feedback, and we thank the reviewers and the editor for their effort.

      Public Reviews

      Reviewer #1 (Public review):

      Summary:

      Winkler et al. present brain activity patterns related to complex motor behaviour by combining whole-head magnetoencephalography (MEG) with subthalamic local field potential (LFP) recordings from people with Parkinson's disease. The motor task involved repetitive circular movements with stops or reversals associated with either predictable or unpredictable cues. Beta and gamma frequency oscillations are described, and the authors found complex interactions between recording sites and task conditions. For example, they observed stronger modulation of connectivity in unpredictable conditions. Moreover, STN power varied across patients during reversals, which differed from stopping movements. The authors conclude that cortex-STN beta modulation is sensitive to movement context, with potential relevance for movement redirection.

      Strengths:

      This study employs a unique methodology, leveraging the rare opportunity to simultaneously record both invasive and non-invasive brain activity to explore oscillatory networks.

      Weaknesses:

      It is difficult to interpret the role of the STN in the context of reversals because no consistent activity pattern emerged.

      We thank the reviewer for the valuable feedback on our study. We agree that the interpretation of the role of the STN during reversals is rather difficult, because reversal-related STN activity was highly variable across patients. Although there seem to be consistent patterns in sub-groups of the current cohort, with some patients showing event-related increases (Fig. 3b) and others showing decreases, the current dataset is not large enough to substantiate or even explain the existence of such clusters. Thus, we limit ourselves to acknowledging this limitation and discussing potential reasons for the high variability, namely variability in electrode placement and insufficient spatial resolution for the separation of specialized cell ensembles within the STN (see Discussion, section Limitations and future directions).

      Reviewer #2 (Public review):

      Summary:

      This study examines the role of beta oscillations in motor control, particularly during rapid changes in movement direction among patients with Parkinson's disease. The researchers utilized magnetoencephalography (MEG) and local field potential (LFP) recordings from the subthalamic nucleus to investigate variations in beta band activity within the cortex and STN during the initiation, cessation, and reversal of movements, as well as the impact of external cue predictability on these dynamics. The primary finding indicates that beta oscillations more effectively signify the start and end of motor sequences than transitions within those sequences. The article is well-written, clear, and concise.

      Strengths:

      The use of a continuous motion paradigm with rapid reversals extends the understanding of beta oscillations in motor control beyond simple tasks. It offers a comprehensive perspective on subthalamocortical interactions by combining MEG and LFP.

      Weaknesses:

      (1) The small and clinically diverse sample size may limit the robustness and generalizability of the findings. Additionally, the limited exploration of causal mechanisms reduces the depth of its conclusions and focusing solely on Parkinson's disease patients might restrict the applicability of the results to broader populations.

      We thank the reviewer for the insightful feedback. We address these issues one by one in our responses to points 2, 4 and 6, respectively.

      (2) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.

      Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size and cohort stratification. We now acknowledge this in the revised discussion (section Limitations and future directions). Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (Discussion, section Limitations and future directions). Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.

      (3) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.

      Thanks for the suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.

      (4) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.

      We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band. The detailed results can be viewed on p. 16 in the section on Granger causality. For statistical testing, we conducted an rmANCOVA, similar to those for power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).

      (5) The study cohort focused on senior adults, who may exhibit age-related changes in the cortical responses underlying movement planning. These aspects were not discussed in the study.

      We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.

      (6) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot exclude the possibility that the observed findings are specific to PD rather than generalizable. Additionally, the current title and framing of the article, which are oriented toward understanding human motor control, may not be appropriate.

      We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS in our centre makes this an unfeasible option in practical terms. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.

      That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication and dopamine has been found to normalize pathological alterations of beta activity. Further, the general pattern of movement-related beta and gamma oscillations reported here has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants.

      Reviewer #3 (Public review):

      Summary:

      The study highlights how the initiation, reversal, and cessation of movements are linked to changes in beta synchronization within the basal ganglia-cortex loops. It was observed that different movement phases, such as starting, stopping briefly, and stopping completely, affect beta oscillations in the motor system.

      It was found that unpredictable cues lead to stronger changes in STN-cortex beta coherence. Additionally, specific patterns of beta and gamma oscillations related to different movement actions and contexts were observed. Stopping movements was associated with a lack of the expected beta rebound during brief pauses within a movement sequence.

      Overall, the results underline the complex and context-dependent nature of motor control and emphasize the role of beta oscillations in managing movement according to changing external cues.

      Strengths:

      The paper is very well written, clear, and appears methodologically sound.

      The use of continuous movement (turning) with reversals is more naturalistic than many previous button-push paradigms.

      Weaknesses:

      The generalizability of the findings is somewhat curtailed by the fact that this was performed perioperatively during the period of the microlesion effect. Given the availability of sensing-enabled DBS devices now and HD-EEG, does MEG offer a significant enough gain in spatial localizability to offset the fact that it has to be done shortly postoperatively with externalized leads, with an attendant stun effect? Specifically, for paradigms that are not asking very spatially localized questions as a primary hypothesis?

      We appreciate the reviewer’s feedback and acknowledge the valid point raised on the timing of our measurements. Indeed, sensing-enabled devices offer a valid alternative to peri-operative recordings, circumventing the stun effect. We acknowledge this in the revised discussion, section Limitations and future directions (p. 23): “Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).” This alternative strategy, however, was not an option here because we did not have a sufficient number of patients implanted with sensing-enabled devices at the time when the data collection was initialized.

      That being said, we would like to highlight that in the present study, our goal was not to study pathology related to Parkinson’s disease. Rather, we aimed to learn about motor control in general. The stun effect may have facilitated motor performance in our patients, which is actually beneficial to the research goals at hand.

      Further investigation of the gamma signal seems warranted, even though its proportional change in amplitude is slightly lower than that of beta. Given that the changes in gamma here are relatively wide-band, this could represent a marker of neural firing that could be interestingly contrasted against the rhythm account presented.

      We appreciate the reviewer’s interest and we have extended the investigation of gamma oscillations. We now provide statistics regarding the influence of predictability on gamma power and gamma coherence (no significant effects) and explore Granger causality in the gamma (and beta) band (see comment 4 of reviewer 2). Unfortunately, we cannot measure spiking via the DBS electrode, and therefore we cannot investigate correlations between gamma oscillatory activity and action potentials. We do agree with the reviewer, however, that action potentials rather than oscillations form the basis of motor control in the brain. This view of ours is now reflected in the revised discussion, section Limitations and future directions (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This is a well-conducted study and overall the results are clear. I only have one minor suggestion for improvement of the manuscript. I found the order of appearance of the results somewhat confusing, switching from predictability-related behavioral effects to primarily stopping and reversal-related neurophysiological effects, back to predictability but starting with coherence. I would suggest that the authors try to follow a systematic order focused on the questions at hand. E.g. perhaps readability could be improved if the results section is split into reversal vs. stopping related effects, reporting behavior, power, and coherence in this order, followed by a predictability section, again reporting behavior, power, and coherence. Obviously, this is an optional suggestion. Apart from that, I just missed a more direct message related to the absence of statistical significance related to STN power changes during reversal. I think this could be made more clear in the text.

      We thank the reviewer for the feedback on our study. In order to ease reading, we modified the order and added additional sub-titles to the results section. We start with Behavior (p. 4) and then move on to Power (general movement effects on power – movement effects on STN power – movement effects on cortical power – predictability effects on power). Next, we move on to Connectivity (movement effects on connectivity – predictability effects on connectivity – Granger causality). We hope that these adaptations will help guide the reader.

      Additionally, we thank the reviewer for noting that we did not explicitly mention the lack of statistical significance of reversal-related beta power modulations in the STN. We have adapted the section on modulation of STN beta power associated with reversals (p. 8) to: “In the STN, reversals were associated with a brief modulation of beta power, which was weak in the group-average spectrum and did not reach significance (Fig. 3a).”

      Reviewer #2 (Recommendations for the authors):

      (1) The small sample size and variability in clinical characteristics among patients may limit the robustness of the study's conclusions. It would be beneficial for the authors to acknowledge this limitation and propose strategies for addressing it in future research. Additionally, incorporating patient-specific factors as covariates in the ANOVA could help mitigate the confounding effects of heterogeneity.

      Thank you for this comment. The challenges associated with recording brain activity peri-operatively can be a limiting factor when it comes to sample size. We now acknowledge this in the revised discussion, section Limitations and future directions (p. 20):

      “Invasive measurements of STN activity are only possible in patients who are undergoing or have undergone brain surgery. Studies drawing from this limited pool of candidate participants are typically limited in terms of sample size and cohort stratification, particularly when carried out in a peri-operative setting. Here, we had a sample size of 20, which is rather high for a peri-operative study, but still low in terms of absolute numbers.”

      Furthermore, we suggest using sensing-capable devices in the future as a measure to increase sample sizes (p. 21):

      “Additionally, future research could capitalize on sensing-capable devices to circumvent the necessity to record brain activity peri-operatively, facilitating larger sample sizes and circumventing the stun effect, an immediate improvement in motor symptoms arising as a consequence of electrode implantation (Mann et al., 2009).”

      Lastly, we appreciate the idea of adding patient-specific factors as covariates to the ANOVAs and have thus included age, disease duration and pre-surgical UPDRS score into our models. This did not lead to any qualitative changes of statistical effects.

      Revised article

      Methods, Statistical analysis:

      “To account for their potential influence on brain activity, we added age, pre-operative UPDRS score, and disease duration as covariates to all ANOVAs. Covariates were standardized by means of z-scoring.”

      (2) The author may consider using standardized statistics, such as effect size, that would provide a clearer picture of the observed effect magnitude and improve comparability.

      Thanks for this useful suggestion. As measures of effect size, we have added partial eta squared (η<sub>p</sub><sup>2</sup>) to the results of all ANOVAs and Cohen’s d to all follow-up t-tests.
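For readers who want the two effect-size measures in concrete terms, a minimal sketch follows (the function names are our own, and the paired-samples form of Cohen's d shown here, often written d_z, is an assumption, since the manuscript does not state which convention was used):

```python
import statistics

def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: the effect's sum of squares relative to
    the effect plus its error sum of squares."""
    return ss_effect / (ss_effect + ss_error)

def cohens_d_paired(a: list, b: list) -> float:
    """Cohen's d for paired samples (d_z convention): mean of the
    pairwise differences divided by their sample standard deviation."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)
```

Both are dimensionless, which is what makes them comparable across studies, as the reviewer requests.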

      (3) Although the study identifies relevance between beta activity and motor events, it lacks causal analysis and discussion of potential causal mechanisms. Given the valuable datasets collected, exploring or discussing causal mechanisms would enhance the depth of the study.

      We appreciate this idea and have conducted Granger causality analyses in response to this comment. This new analysis reveals that there is a strong cortical drive to the STN for all movements of interest and predictability conditions in the beta band, but no directed interactions in the gamma band. For statistical testing, we conducted an rmANCOVA, similar to the analysis of power and coherence (see p. 46-48 and 54-56 for the corresponding tables), as well as t-tests assessing directionality (Figure 6-figure supplement 2 on p. 35). In the discussion section, we connect these results with prior findings suggesting that the frontal cortex drives the STN in the beta band, likely through hyperdirect pathway fibers (p. 17).

      Revised article

      Methods Section, Granger Causality Analysis

      “We computed beta and gamma band non-parametric Granger causality (Dhamala, Rangarajan, & Ding, 2008) between cortical ROIs and the STN in the hemisphere contralateral to movement for the post-event time windows (0 – 2 s with respect to start, reversal, and stop). Because estimates of Granger causality are often biased, we compared the original data to time-reversed data to suppress non-causal interactions. True directional influence is reflected by a higher causality measure in the original data than in its time-reversed version, resulting in a positive difference between the two, the opposite being the case for a signal that is “Granger-caused” by the other. Directionality is thus reflected by the sign of the estimate (Haufe, Nikulin, Müller, & Nolte, 2013). Because rmANCOVA results indicated no significant effects for predictability and movement type, and post-hoc tests did not detect significant differences between hemispheres, we averaged Granger causality estimates over movement types, hemispheres and predictability conditions in Figure 6-figure supplement 2.”

      Results, Granger causality

      “In general, cortex appeared to drive the STN in the beta band, regardless of the movement type and predictability condition. This was reflected in a main effect of ROI on Granger causality estimates (F<sub>ROI</sub>(7,9) = 3.443, p<sub>ROI</sub> = 0.044, η<sub>p</sub><sup>2</sup> = 0.728; refer to Supplementary File 4 for the full results of the ANOVA). In the hemisphere contralateral to movement, follow-up t-tests revealed significantly higher Granger causality estimates from M1 to the STN (t = 3.609, one-sided p < 0.001, d = 0.807) and from MSMC to the STN (t = 2.051, one-sided p = 0.027, d = 0.459) than the other way around. The same picture emerged in the hemisphere ipsilateral to movement (M1 to STN: t = 3.082, one-sided p = 0.003, d = 0.689; MSMC to STN: t = 1.833, one-sided p = 0.041, d = 0.410). In the gamma band, we did not detect a significant drive from one area to the other (F<sub>ROI</sub>(7,9) = 0.338, p<sub>ROI</sub> = 0.917, η<sub>p</sub><sup>2</sup> = 0.208, Supplementary File 6). Figure 6-figure supplement 2 demonstrates the differences in Granger causality between original and time-reversed data for the beta and gamma band.”

      Discussion, The dynamics of STN-cortex coherence

      “Considering the timing of the increase observed here, the STN’s role in movement inhibition (Benis et al., 2014; Ray et al., 2012), and the fact that frontal and prefrontal cortical areas are believed to drive subthalamic beta activity via the hyperdirect pathway (Chen et al., 2020; Oswal et al., 2021), it seems plausible that the increase of beta coherence reflects feedback of sensorimotor cortex to the STN in the course of post-movement processing. In line with this idea, we observed a cortical drive of subthalamic activity in the beta band.”
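The time-reversal control described in the quoted Methods text lends itself to a compact illustration. The following toy sketch is our own, hypothetical example - a one-lag least-squares Granger estimate on simulated data, not the non-parametric spectral method (Dhamala et al., 2008) used by the authors - showing why a genuine driver yields a positive original-minus-reversed difference:

```python
import math
import random

def _resid_var(y, preds):
    """OLS residual variance of y on one or two predictor series
    (normal equations solved by hand for this toy bivariate case)."""
    if len(preds) == 1:
        (x1,) = preds
        a = sum(u * v for u, v in zip(x1, y)) / sum(u * u for u in x1)
        res = [yi - a * u for yi, u in zip(y, x1)]
    else:
        x1, x2 = preds
        s11 = sum(u * u for u in x1)
        s22 = sum(u * u for u in x2)
        s12 = sum(u * v for u, v in zip(x1, x2))
        s1y = sum(u * v for u, v in zip(x1, y))
        s2y = sum(u * v for u, v in zip(x2, y))
        det = s11 * s22 - s12 * s12
        a = (s22 * s1y - s12 * s2y) / det
        b = (s11 * s2y - s12 * s1y) / det
        res = [yi - a * u - b * v for yi, u, v in zip(y, x1, x2)]
    return sum(r * r for r in res) / len(y)

def granger(src, dst):
    """One-lag Granger causality src -> dst: log ratio of residual
    variances without vs. with the source's past in the model."""
    y, own, other = dst[1:], dst[:-1], src[:-1]
    return math.log(_resid_var(y, [own]) / _resid_var(y, [own, other]))

random.seed(1)
n = 4000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [0.0] * n
for t in range(1, n):  # x drives y with one sample of delay
    y[t] = 0.8 * x[t - 1] + 0.3 * random.gauss(0.0, 1.0)

gc_xy = granger(x, y)                    # large: x Granger-causes y
gc_yx = granger(y, x)                    # near zero: no reverse drive
net = gc_xy - granger(x[::-1], y[::-1])  # positive: survives time reversal
```

Because reversing time destroys the genuine one-sample lead of x over y, the reversed estimate collapses toward zero and `net` stays positive, which is the sign convention the Methods quote describes.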

      (4) The study cohort focused on senior adults, who may exhibit age-related changes in the cortical responses underlying movement planning. These aspects were not discussed in the study.

      We appreciate the comment and agree that age may have impacted neural oscillatory activity of patients in the present study. We now acknowledge this in the limitations section, and point out that our approach to handling these effects was including age as a covariate in the statistical analyses.

      Revised article

      Discussion, Limitations and Future Directions

      “Further, most of our participants were older than 60 years. To diminish any confounding effects of age on movement-related modulations of neural oscillations, such as beta suppression and rebound (Bardouille & Bailey, 2019; Espenhahn et al., 2019), we included age as a covariate in the statistical analyses.”

      (5) Including a control group of patients with other movement disorders who also undergo DBS surgery would be beneficial, because we cannot exclude the possibility that the observed findings are specific to PD rather than generalizable. Additionally, the current title and framing of the article, which are oriented toward understanding human motor control, may not be appropriate.

      We thank the reviewer for this comment and fully agree that it cannot be ruled out that the present findings are, in part, specific to PD. We acknowledge this limitation in the Limitations and future directions section (p. 20-21). Indeed, including a control group of patients with other disorders would be ideal, but the scarcity of patients with diseases other than PD who receive STN DBS makes this an unfeasible option. We do suggest that future research may address this issue by extending our approach to different disorders or healthy participants on the cortical level (p. 21). Lastly, we appreciate the idea to adjust the title of the present article. The adjusted title is: “Context-Dependent Modulations of Subthalamo-Cortical Synchronization during Rapid Reversals of Movement Direction in Parkinson’s Disease”.

      That being said, we do believe that our findings at least approximate healthy functioning and are not solely related to PD. For one, patients were on their usual dopaminergic medication for the study and dopamine has been found to normalize pathological alterations of beta activity. More importantly, the general pattern of movement-related beta and gamma oscillations has been observed in numerous diseases and brain structures, including cortical beta oscillations measured non-invasively in healthy participants. Thus, it is not unlikely that the new aspects discovered here are also general features of motor processing.

      Revised article

      Discussion, Limitations and future directions

      “Furthermore, we cannot be sure to what extent the present study’s findings relate to PD pathology rather than general motor processing. We suggest that our approach at least approximates healthy brain functioning as patients were on their usual dopaminergic medication. Dopaminergic medication has been demonstrated to normalize power within the STN and globus pallidus internus, as well as STN-globus pallidus internus and STN-cortex coherence (Brown et al., 2001; Hirschmann et al., 2013). Additionally, several of our findings match observations made in other patient populations and healthy participants, who exhibit the same beta power dynamics at movement start and stop (Alegre et al., 2004) that we observed here. Notably, our finding of enhanced cortical involvement in the face of uncertainty aligns well with established theories of cognitive processing, given the cortex's prominent role in managing higher cognitive functions (Altamura et al., 2010). Yet, transferring our approach and task to patients with different disorders, e.g. obsessive-compulsive disorder, or examining young and healthy participants solely at the cortical level, could contribute to elucidating whether the synchronization dynamics reported here are indeed independent of PD and age.”

      Reviewer #3 (Recommendations for the authors):

      Despite the strengths of the "rhythm" account of cognitive processes, the paper could possibly be improved by making it less skewed toward rhythms as the explanation for all of movement encoding.

      Thank you for this comment - the point is well taken. There is a large body of literature relating neural oscillations to spiking in larger neural populations, which itself is likely the most relevant signal with respect to motor control. In our eyes, it is this link that justifies the rhythm account, i.e. we agree with the reviewer that action potentials are the basis of movement encoding in the brain, not oscillations. Unfortunately, we cannot measure spiking with the method at hand.

      To better integrate this view into the current manuscript, we make the following suggestion for future research in the Limitations and future directions section (p. 21): “Lastly, given the present study’s focus on understanding movement-related rhythms, particularly in the beta range, future research could further explore the role of gamma oscillations in continuous movement and their relation to action potentials in motor areas (Fischer et al., 2020; Igarashi, Isomura, Arai, Harukuni, & Fukai, 2013), which form the basis of movement encoding in the brain.”

      In Figure 5 - is the legend correct? Is it really just a 0.2% change in power only? That would be a very surprisingly small effect size.

      We thank the reviewer for noting this. Indeed, the numbers on the scale quantify relative change (post - pre)/pre and should be multiplied by 100 to obtain %-change. We have adjusted the color bars accordingly.
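      As a minimal numeric sketch of the conversion described above (the values are illustrative only, not data from the study):

```python
import numpy as np

# Illustrative pre- and post-event power values (arbitrary units).
pre = np.array([10.0, 12.0, 8.0])
post = np.array([9.98, 12.5, 7.6])

# Relative change as plotted originally: (post - pre) / pre.
rel_change = (post - pre) / pre

# Multiply by 100 to obtain %-change, as on the adjusted color bars.
pct_change = rel_change * 100
```

      Here a relative change of -0.002 corresponds to a -0.2% change, which is why the original unscaled color bar read as a surprisingly small effect.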

      The dissociation between the effects of unpredictable cues in coherence versus raw power is interesting and could potentially be directly contrasted further in the discussion (here they are presented separately with separate discussions, but this seems like a pretty important and novel finding as beta coherence and power usually go in the same direction).

      We appreciate the reviewer’s interest in our findings on the predictability of movement instructions. In case of coherence, the difference between pre- and post-event was generally more positive in the unpredictable condition, meaning that suppressions (negative pre-post difference) were diminished whereas increases (positive pre-post difference) were enhanced. With respect to power, we also observed less suppression in the unpredictable condition at movement start. Therefore, the direction of change is in fact the same. We made this clearer in the revised version by adapting the corresponding sections of the abstract, results and discussion (see below).

      The only instance of coherence and power diverging (on a qualitative level) was observed during reversals: here, we noted post-event increases in coherence and post-event decreases in M1 power in the group-average spectra. However, when comparing the pre- and post-event epochs statistically by means of permutation testing, the coherence increase did not reach significance. Hence, we did not highlight this aspect.

      Revised version

      Abstract

      “… Event-related increases of STN-cortex beta coherence were generally stronger in the unpredictable than in the predictable condition. … “

      Results, Effects of predictability on beta power  

      “With respect to the effect of predictability of movement instructions on beta power dynamics (research aim 2), we observed an interaction between movement type and condition (F<sub>cond*mov</sub> (2,14) = 4.206, p<sub>cond*mov</sub> = 0.037, η<sub>p</sub><sup>2</sup> = 0.375), such that the beta power suppression at movement start was generally stronger in the predictable (M = -0.170, SD = 0.065) than in the unpredictable (M = -0.154, SD = 0.070) condition across ROIs (t = -1.888, one-sided p = 0.037, d = -0.422). We did not observe any modulation of gamma power by the predictability of movement instructions (F<sub>cond</sub> (1,15) = 0.792, p<sub>cond</sub> = 0.388, η<sub>p</sub><sup>2</sup> = 0.050, Supplementary File 5).”

      Effects of predictability on STN-cortex coherence

      “With respect to the effect of predictability of movement instructions on beta coherence (research aim 2), we found that the pre-post event differences were generally more positive in the unpredictable condition (main effect of predictability condition; F<sub>cond</sub>(1,15) = 8.684, p<sub>cond</sub> = 0.010, η<sub>p</sub><sup>2</sup> = 0.367; Supplementary File 3), meaning that the suppression following movement start was diminished and the increases following stop and reversal were enhanced in the unpredictable condition (Fig. 6a). This effect was most pronounced in the MSMC (Fig. 6b). When comparing region-average TFRs between the unpredictable and the predictable condition, we observed a significant difference only for stopping (t<sub>clustersum</sub> = 142.8, p = 0.023), suggesting that the predictability effect was mostly carried by increased beta coherence following stops. When repeating the rmANCOVA for pre-event coherence, we did not observe an effect of predictability (F<sub>cond</sub>(1,15) = 0.163, p<sub>cond</sub> = 0.692, η<sub>p</sub><sup>2</sup> = 0.011), i.e. the effect was most likely not due to a shift of baseline levels. The increased tendency for upward modulations and decreased tendency for downward modulations rather suggests that the inability to predict the next cue prompted intensified event-related interaction between STN and cortex. STN-cortex gamma coherence was not modulated by predictability (F<sub>cond</sub>(1,15) = 0.005, p<sub>cond</sub> = 0.944, η<sub>p</sub><sup>2</sup> = 0.000, Supplementary File 5).”

      Discussion, Beta coherence and beta power are modulated by predictability

      “In the present paradigm, patients were presented with cues that were either temporally predictable or unpredictable. We found that unpredictable movement prompts were associated with stronger upward modulations and weaker downward modulations of STN-cortex beta coherence, likely reflecting the patients adopting a more cautious approach, paying greater attention to instructive cues. Enhanced STN-cortex interactions might thus indicate the recruitment of additional neural resources, which might have allowed patients to maintain the same movement speed in both conditions. […]”

      With respect to power, we observed reduced beta suppression in the unpredictable condition at movement start, consistent with the effect on coherence, likely demonstrating a lower level of motor preparation.

      Given that you have a nice continuous-data task here (the turning of the wheel), it might be interesting to cross-correlate the circular position (and, separately, the velocity) of the turning with the envelope of the beta signal. It would be a nice finding if you could also show that beta is modulated continuously by the continuous movements. In the natural world, we rarely make a continuous movement with a sudden reversal or stop; most of the time we are in continuous movement. Looking at this might also be a strength of your dataset.

      We could not agree more. In fact, having a continuous behavioral output was a major motivation for choosing this particular task. We are very interested in state space models such as preferential subspace identification (Sani et al., 2021), for example. These models relate continuous brain signals to continuous behavioral target variables and should be of great help for questions such as: do oscillations relate to moment-by-moment adaptations of continuous movement? Which frequency bands and brain areas are important? Is angular position encoded by different brain areas/frequency bands than angular speed? These analyses are in fact ongoing. This project, however, is too large to fit into the current article.

    1. Sports riots have become commonplace

      Makes me think of the videos I saw of Philadelphia after the Eagles just won the Super Bowl. I saw broken windows and trashed streets.

    1. With Blue - uncertain - stumbling Buzz - Between the light - and me - And then the Windows failed - and then I could not see to see -

      Emily Dickinson brings out emotion by challenging our views on death. Most would believe death to be some kind of dramatic transformation into whatever comes after, but Dickinson argues how mundane the process of death might be. A process so ordinary that it could be easily interrupted by the buzz of a fly.

    2. With Blue - uncertain - stumbling Buzz - Between the light - and me - And then the Windows failed - and then I could not see to see -

      Emily Dickinson uses symbols to show death as confusing and unsettling. The "blue – uncertain – stumbling Buzz" of the fly represents the body's breakdown, interrupting the expected peaceful journey to the afterlife. Light may stand for life or understanding, but the fly gets in the way, focusing on the physical side of death instead of a spiritual one. The failing windows show the loss of awareness, and "I could not see to see" emphasizes the finality and mystery of death.

    3. With Blue - uncertain - stumbling Buzz - Between the light - and me - And then the Windows failed - and then I could not see to see -

      She is confused about death and the sound of the buzzing as her eyes are closing.

    4. With Blue - uncertain - stumbling Buzz - Between the light - and me - And then the Windows failed - and then I could not see to see -

      In these lines, Dickinson uses metaphors to express emotional confusion and disorientation. The "Blue - uncertain - stumbling Buzz" conveys a sense of disarray and unease, while the "Windows failed" suggests a loss of clarity or vision. The phrase "I could not see to see" highlights a moment of emotional blindness, unable to comprehend or process the situation. The imagery portrays the overwhelming and isolating nature of the speaker's emotions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Thank you for your constructive feedback and recognition of our work. We followed your suggestion and improved the accuracy of the language used to interpret some of our findings. 

      Summary:

      The present study by Mikati et al demonstrates an improved method for in-vivo detection of enkephalin release and studies the impact of stress on the activation of enkephalin neurons and enkephalin release in the nucleus accumbens (NAc). The authors refine their pipeline to measure met and leu enkephalin using liquid chromatography and mass spectrometry. The authors subsequently measured met and leu enkephalin in the NAc during stress induced by handling, and fox urine, in addition to calcium activity of enkephalinergic cells using fiber photometry. The authors conclude that this improved tool for measuring enkephalin reveals experimenter handling stress-induced enkephalin release in the NAc that habituates and is dissociable from the calcium activity of these cells, whose activity doesn't habituate. The authors subsequently show that NAc enkephalin neuron calcium activity does habituate to fox urine exposure, is activated by a novel weigh boat, and that fox urine acutely causes increases in met-enk levels, in some animals, as assessed by microdialysis.

      Strengths:

      A new approach to monitoring two distinct enkephalins and a more robust analytical approach for more sensitive detection of neuropeptides. A pipeline that potentially could help for the detection of other neuropeptides.

      Weaknesses:

      Some of the interpretations are not fully supported by the existing data or would require further testing to draw those conclusions. This can be addressed by appropriately tempering the interpretations and acknowledging additional limitations, not covered by the authors, that arise from procedural differences between experiments.

      We have taken time to go through the manuscript ensuring we are more detailed and precise with our interpretations as well as appropriately acknowledging limitations. 

      Reviewer #2 (Public Review):

      Thank you for your constructive and thorough assessment of our work. In our revised manuscript, we adjusted the text to reflect the references you mentioned regarding the methionine oxidation procedure. Additionally, we expanded the methods section to include the key details of the statistical tests and procedures that you outlined. 

      Summary:

      The authors aimed to improve the detection of enkephalins, opioid peptides involved in pain modulation, reward, and stress. They used optogenetics, microdialysis, and mass spectrometry to measure enkephalin release during acute stress in freely moving rodents. Their study provided better detection of enkephalins due to the implementation of previously reported derivatization reaction combined with improved sample collection and offered insights into the dynamics and relationship between Met- and Leu-Enkephalin in the Nucleus Accumbens shell during stress.

      Strengths:

      A strength of this work is the enhanced opioid peptide detection resulting from an improved microdialysis technique coupled with an established derivatization approach and sensitive and quantitative nLC-MS measurements. These improvements allowed basal and stimulated peptide release with higher temporal resolution, lower detection thresholds, and native-state endogenous peptide measurement.

      Weaknesses:

      The draft incorrectly credits itself for the development of an oxidation method for the stabilization of Met- and Leu-Enk peptides. The use of hydrogen peroxide reaction for the oxidation of Met-Enk in various biological samples, including brain regions, has been reported previously, although the protocols may slightly vary. Specifically, the manuscript writes about "a critical discovery in the stabilization of enkephalin detection" and that they have "developed a method of methionine stabilization." Those statements are incorrect and the preceding papers that relied on hydrogen peroxide reaction for oxidation of Met-Enk and HPLC for quantification of oxidized Enk forms should be cited. One suggested example is Finn A, Agren G, Bjellerup P, Vedin I, Lundeberg T. Production and characterization of antibodies for the specific determination of the opioid peptide Met5-Enkephalin-Arg6-Phe7. Scand J Clin Lab Invest. 2004;64(1):49-56. doi: 10.1080/00365510410004119. PMID: 15025428.

      Thank you for highlighting this. It was not our intention to imply that we developed the oxidation method, but rather that we were able to improve the detection of Met-Enkephalin by oxidation of the methionine without compromising the detection resolution of Leu-Enkephalin, enabling the simultaneous detection of both peptides. We have addressed this in the manuscript and included the suggested citation. 

      Another suggestion for this draft is to make the method section more comprehensive by adding information on specific tools and parameters used for statistical analysis:

      (1) Need to define "proteomics data" and explain whether calculations were performed on EIC for each m/z corresponding to specific peptides or as a batch processing for all detected peptides, from which only select findings are reported here. What type of data normalization was used, and other relevant details of data handling? Explain how Met- and Leu-Enk were identified from DIA data, and what tools were used.

      Thank you for pointing out this source of confusion. We believe it arises because we use a different DIA method than is typical in the literature. Briefly, we use a DIA method with a targeted inclusion list to ensure MS2 triggering, as opposed to using large isolation widths to capture all precursors for fragmentation, as is typically done with MS1 features. In our method, MS2 is triggered based on the 4 selected m/z values (heavy and light versions of the Leu- and Met-Enkephalin peptides) at specific retention time windows with an isolation width of 2 Da, regardless of the MS1 intensity of the peptides. 

      (2) Simple Linear Regression Analysis: The text mentions that simple linear regression analysis was performed on forward and reverse curves, and line equations were reported, but it lacks details such as the specific variables being regressed (although figures have labels) and any associated statistical parameters (e.g., R-squared values). 

      Additional detail about the linear regression process was added to the methods section, please see lines 614-618. The R squared values are also now shown on the figure. 

      ‘For the forward curves, the regression was applied to the measured concentration of the light standard as the theoretical concentration was increased. For plotting purposes, we show the measured peak area ratios for the light standards in the forward curves. For the reverse curves, the regression was applied to the measured concentration of the heavy standard, as the theoretical concentration was varied.’
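      The fitting procedure described in the quoted passage can be sketched as a simple least-squares linear regression; the concentrations and units below are made up for illustration and are not the study's calibration data:

```python
import numpy as np

# Hypothetical forward calibration curve: theoretical concentrations of the
# light standard vs. concentrations measured from light/heavy peak-area
# ratios (illustrative values, amol/sample).
theoretical = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
measured = np.array([5.2, 9.7, 20.6, 39.1, 81.0])

# Simple linear regression of measured on theoretical concentration.
slope, intercept = np.polyfit(theoretical, measured, 1)

# R-squared from the residuals of the fitted line.
pred = slope * theoretical + intercept
ss_res = np.sum((measured - pred) ** 2)
ss_tot = np.sum((measured - measured.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
```

      A reverse curve would follow the same recipe with the heavy standard's measured concentration as the dependent variable.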

      (3) Violin Plots: The proteomics data is represented as violin plots with quartiles and median lines. This visual representation is mentioned, but there is no detail regarding the software/tools used for creating these plots.

      We used Graphpad Prism to create these plots. This detail has been added to the statistical analysis section. See line 630.

      (4) Log Transformation: The text states that the data was log-transformed to reduce skewness, which is a common data preprocessing step. However, it does not specify the base of the logarithm used or any information about the distribution before and after transformation.

      We have added the requested details about the log transformation, and how the data looked before and after, to the statistical analysis section. We followed the convention that log is base 10 unless otherwise specified as natural log (base e) or a different base. See lines 622-625.

      ‘The data was log10 transformed to reduce the skewness of the dataset caused by the variable range of concentrations measured across experiments/animals. Prior to log transformation, the measurements failed normality testing for a Gaussian distribution. After the log transformation, the data passed normality testing, which provided the rationale for the use of statistical analyses that assume normality.’
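      The effect of a log10 transform on a right-skewed dataset can be sketched as follows (synthetic log-normal data, not the study's measurements):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical peptide concentrations spanning a wide range, mimicking the
# variable levels measured across experiments/animals (right-skewed).
conc = rng.lognormal(mean=2.0, sigma=1.0, size=500)

def skewness(x):
    """Sample skewness: mean of the cubed standardized values."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

raw_skew = skewness(conc)            # strongly positive before transform
log_skew = skewness(np.log10(conc))  # near zero after log10 transform
```

      The transformed values are approximately symmetric, which is what makes normality-assuming tests defensible afterwards.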

      (5) Two-Way ANOVA: Two-way ANOVA was conducted with peptide and treatment as independent variables. This analysis is described, but there is no information regarding the software or statistical tests used, p-values, post-hoc tests, or any results of this analysis.

      Information about the two-way ANOVA analysis has been added to the statistical analysis section. Additionally, more detailed information has been added to the figure legends about the statistical results. Please see lines 625-628.

      ‘Two-way ANOVA testing with peptide (Met-Enk or Leu-Enk) and treatment (buffer or stress for example) as the two independent variables. Post-hoc testing was done using Šídák's multiple comparisons test and the p values for each of these analyses are shown in the figures (Figs. 1F, 2A).’ 

      (6) Paired T-Test: A paired t-test was performed on predator odor proteomic data before and after treatment. This step is mentioned, but specific details like sample sizes, and the hypothesis being tested are not provided.

      The sample size is included in the figure legend to which we have included a reference. We have also included the following text to highlight the purpose of this test. See lines 628-630

      ‘A paired t-test was performed on the predator odor proteomic data before and after odor exposure to test the hypothesis that Met-Enk increases following exposure to predator odor (Fig. 3F). These analyses were conducted using Graphpad Prism.’

      (7) Correlation Analysis: The text mentions a simple linear regression analysis to correlate the levels of Met-Enk and Leu-Enk and reports the slopes. However, details such as correlation coefficients, and p-values are missing.

      We apologize for the use of the word correlation as we think it may have caused some confusion and have adjusted the language accordingly. Since this was a linear regression analysis, there is no correlation coefficient. The slope of the fitted line is reported on the figures to show the fitted values of Met-Enk to Leu-Enk. 

      (8) Fiber Photometry Data: Z-scores were calculated for fiber photometry data, and a reference to a cited source is provided. This section lacks details about the calculation of zscores, and their use in the analysis. 

      These details have been added to the statistical analysis section. See lines 634-637

      ‘For the fiber photometry data, the z-scores were calculated using GuPPy, an open-source Python toolbox for fiber photometry analysis. The z-score equation used in GuPPy is z = (DF/F − mean(DF/F)) / SD(DF/F), where F refers to the fluorescence of the GCaMP6s signal.’
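      For concreteness, the z-scoring in the quoted equation amounts to the following (the `dff` trace is a made-up illustration, not GuPPy code or study data):

```python
import numpy as np

# Hypothetical dF/F trace from a GCaMP6s recording (illustrative values).
dff = np.array([0.01, 0.03, 0.02, 0.10, 0.25, 0.12, 0.04, 0.02])

# z-score: subtract the mean of dF/F, divide by its standard deviation.
z = (dff - dff.mean()) / dff.std()
```

      By construction the resulting trace has zero mean and unit standard deviation, which is what allows averaging across animals.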

      (9) Averaged Plots: Z-scores from individual animals were averaged and represented with SEM. It is briefly described, but more details about the number of animals, the purpose of averaging, and the significance of SEM are needed.

      We have added additional information about the averaging process in the statistical analysis section. See lines 639-643.

      ‘The purpose of the averaged traces is to show the extent of concordance of the response to experimenter handling and predator odor stress among animals with the SEM demonstrating that variability. The heatmaps depict the individual responses of each animal. The heatmaps were plotted using Seaborn in Python and mean traces were plotted using Matplotlib in Python.’

      A more comprehensive and objective interpretation of results could enhance the overall quality of the paper.

      We have taken this opportunity to improve our manuscript following comments from all the reviewers that we hope has resulted in a manuscript with a more objective interpretation of results. 

      Reviewer #3 (Public Review):

      Thank you for your thoughtful review of our work. To clarify some of the points you raised, we revised the manuscript to include more detail on how we distinguish between the oxidized endogenous and standard signal, as well as refine the language concerning the spatial resolution. We also edited the manuscript regarding the concentration measurements. We conducted technical replicates, so we appreciate you raising this point and clarify that in the main text. 

      Summary:

      This important paper describes improvements to the measurement of enkephalins in vivo using microdialysis and LC-MS. The key improvement is the oxidation of Met-Enk to prevent having a mix of reduced and oxidized methionine in the sample, which makes quantification more difficult. It then shows measurements of enkephalins in the nucleus accumbens in two different stress situations - handling and exposure to predator odor. It also reports the ratio of released met- and leu-enkephalin matching what is expected from the digestion of proenkephalin. Measurements are also made by photometry of Ca2+ changes for the fox odor stressor. Some key takeaways are the reliable measurement of met-enkephalin, the significance of directly measuring peptides as opposed to proxy measurements, and the opening of a new avenue into the research of enkephalins due to stress based on these direct measurements.

      Strengths:

      -Improved methods for measurement of enkephalins in vivo.

      -Compelling examples of using this method.

      -Opening a new area of looking at stress responses through the lens of enkephalin concentrations.

      Weaknesses:

      (1) It is not clear if oxidized met-enk is endogenous or not and this method eliminates being able to discern that.

      We clarified our wording in the text copied below to explain how we distinguish between the two. Even after oxidation, the standard signal has a higher m/z ratio due to the presence of the carbon and nitrogen isotopes, as described in the Chemicals section of the methods: ‘For Met-Enkephalin, a fully labeled L-Phenylalanine (<sup>13</sup>C<sub>9</sub>, <sup>15</sup>N) was added (YGGFM). The resulting mass shifts between the endogenous (light) and heavy isotope-labeled peptides are 7 Da and 10 Da, respectively.’ The standards can therefore still be differentiated from the endogenous signal. We have clarified the language in the results section. See lines 82-87. 

      ‘After each sample collection, we add a consistent known concentration of isotopically labeled internal standard of Met-Enk and Leu-Enk of 40 amol/sample to the collected ISF for the accurate identification and quantification of endogenous peptide. These internal standards have a different mass/charge (m/z) ratio than endogenous Met- and Leu-Enk. Thus, we can identify true endogenous signal for Met-Enk and Leu-Enk (Suppl Fig. 1A,C) versus noise, interfering signals, and standard signal (Suppl. Fig. 1B,D).’

      (2) It is not clear if the spatial resolution is really better as claimed since other probes of similar dimensions have been used.

      Apologies for any confusion here. To clarify, we primarily state that our approach improves temporal resolution, and in a few cases refer to improved spatiotemporal resolution, which we believe we show. The dimensions of the microdialysis probe used in these experiments allow us to target the nucleus accumbens shell while being smaller, especially at the membrane level, than a fiber photometry probe. 

      (3) Claims of having the first concentration measurement are not quite accurate.

      Thank you for your feedback. To clarify, we do not claim that we have the first concentration measurements, rather we are the first to quantify the ratio of Met-Enk to Leu-Enk in vivo in freely behaving animals in the NAcSh. 

      (4) Without a report of technical replicates, the reliability of the method is not as well-evaluated as might be expected.

      We have added these details in the methods section, please see lines 521-530. 

      ‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      • The authors should provide appropriate citations of a study that has validated the Enkephalin-Cre mouse line in the nucleus accumbens or provide verification experiments if they have any available.

      Thank you for your comment. We have added a reference validating the Enk-Cre mouse line in the nucleus accumbens to the methods section; it is copied here: 

      D.C. Castro, C.S. Oswell, E.T. Zhang, C.E. Pedersen, S.C. Piantadosi, M.A. Rossi, A.C. Hunker, A. Guglin, J.A. Morón, L.S. Zweifel, G.D. Stuber, M.R. Bruchas, An endogenous opioid circuit determines state-dependent reward consumption, Nature 2021 598:7882 598 (2021) 646–651. https://doi.org/10.1038/s41586-02104013-0.

      • Better definition of the labels y1,y2,b3 in Figures 1 and S1 would be useful. I may have missed it but it wasn't described in methods, results, or legends.

      Thank you for this comment. We have added this information to the Fig. 1 legend: ‘Y1, y2, b3 refer to the different elution fragments resulting from Met-Enk during LC-MS.’

      • It is interesting that the ratio of KCl-evoked release is what changes differentially for Met- vs Leu. Leu-enk increases to the range of met-enk. Leu-enk is non-detectable, or approaching non-detectable (below the 40 amol/sample limit of quantification), in most of the subjects, but becomes apparent and approaches basal levels of met-enkephalin. This suggests that the K+-evoked response may be more pronounced for leu-enk. This is something that should be considered for further analysis and should be discussed.

      Thank you for this astute observation, and you make a great point. We have added some discussion of this finding in the results and discussion sections; see lines 111-112 and lines 253-257. 

      ‘Interestingly, Leu-Enk showed a greater fold change compared to baseline than did Met-Enk with the fold changes being 28 and 7 respectively based on the data in Fig.1F.’

      ‘We also noted that Leu-Enk showed a greater fold increase relative to baseline after depolarization with high K+ buffer as compared to Met-Enk. This may be due to increased Leu-Enk packaging in dense core vesicles compared to Met-Enk or due to the fact that there are two distinct precursor sources for Leu-Enk, namely both proenkephalin and prodynorphin while Met-Enk is mostly cleaved from proenkephalin (see Table 1 [48]).’
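      The fold changes quoted above are simple ratios of evoked to baseline concentration; a sketch with made-up concentrations chosen only to reproduce the reported values:

```python
# Hypothetical baseline and high-K+-evoked concentrations (amol/sample);
# the numbers are illustrative, picked to yield the reported fold changes.
leu_baseline, leu_evoked = 5.0, 140.0
met_baseline, met_evoked = 40.0, 280.0

leu_fold = leu_evoked / leu_baseline  # 28-fold increase over baseline
met_fold = met_evoked / met_baseline  # 7-fold increase over baseline
```

      Even though Met-Enk is more abundant in absolute terms, Leu-Enk can show the larger relative (fold) change, as in Fig. 1F.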

      • For example, in 2E it would be helpful to label on the graph axis which samples correspond to the manipulation, and also to provide the reader with the sample numbers in the text. The authors interpret the relationship between the last two samples of baseline and post-handling stress as follows in the figure legend: "the concentration released in later samples is affected; such influence suggests that there is regulation of the maximum amount of peptide to be released in NAcSh. E. The negative correlation in panel d is reversed by using a high K+ buffer to evoke Met-Enk release, suggesting that the limited release observed in D is due to modulation of peptide release rather than depletion of reserves." However, the correlations are similar between 2D and E, and it appears that two mice are mediating the difference between the two groups. The appropriate statistical analysis would be to compare the regressions of the two groups. Statistics for the high K+ (and all other graphs where appropriate) need to be reported, including the r2 and p-value.

      Thank you for your constructive critique. To elucidate the effect of high K+, we have plotted the regression line and reported the slope for Fig. 2E. Notably, the slope is reduced by a factor of 2 and appears to be driven by a large subset of the animals. The statistics for the high K+ graph are shown on the figure (Fig. 1F); they test whether high K+ leads to the release of Leu-Enk and Met-Enk, respectively, compared to baseline with aCSF. We have added the test statistics to the figure legend for additional clarity. Fig. 1G has no statistics because it is only there to elucidate the ratio between Met-Enk and Leu-Enk in the same samples; we did not test any hypotheses about differences between their levels, as that is not relevant to our question. The correlation on the same data is depicted in Fig. 1H, and we have added the R<sup>2</sup> value per your request. 

      • The interpretation that handling stress induces enkephalin release from microdialysis experiments is also confounded by other factors. For instance, from the methods, it appears that mice were connected and sample collection started 30 min after surgery, therefore recovery from anesthesia is also a confounding variable, among other technical aspects, such as equilibration of the interstitial fluid to the aCSF running through the probe that is acting as a transmitter and extracellular molecule "sink". Did the authors try to handle the mice post hookup similar to what was done with photometry to have a more direct comparison to photometry experiments? This procedural difference, recording from recently surgerized animals (microdialysis) vs well-recovered animals with photometry should be mentioned in addition to the other caveats the authors mention.

      Thank you for your comment. We are aware of this technical limitation, and it is largely why we conducted the fiber photometry experiments to address the same question. As you requested, we have included additional language in the discussion to acknowledge this limitation and how we chose to address it by measuring calcium activity in the enkephalinergic neurons, presumably the same cell population whose release we quantify using microdialysis. See lines 262-273.

      ‘Our findings showed a robust increase in peptide release at the beginning of experiments, which we interpreted as due to the experimenter handling stress that directly precedes microdialysis collections. However, there are other technical limitations to consider, such as the fact that we were collecting samples from mice that had recently been operated on. Another consideration is that the circulation of aCSF through the probe may cause a sudden shift in oncotic and hydrostatic forces, leading to increased peptide release into the extracellular space. As such, we wanted to examine our findings using a different technique, so we chose to record calcium activity from enkephalinergic neurons, the same cell population leading to peptide release. Using fiber photometry, we showed that enkephalinergic neurons are activated by stress exposure, both experimenter handling and fox odor, which could explain the heightened peptide levels at the beginning of microdialysis experiments.’

      • The authors should provide more details on handling stress manipulation during photometry. For photometry what was the duration of the handling bout, what was the interval between handling events, and can the authors provide a description of what handling entailed? Were mice habituated to handling days before doing photometry recording experiments?

      Thank you for your suggestion. We have addressed all of your points in the methods section. See lines 564-570. 

      ‘The handling bout which mimicked traditional scruffing lasted about 3-5 seconds. The mouse was then let go and the handling was repeated another two times in a single session with a minimum of 1-2 minutes between handling bouts. Mice were habituated to this manipulation by being attached to the fiber photometry rig, for 3-5 consecutive days prior to the experimental recording. Additionally, the same maneuver was employed when attaching/detaching the fiber photometry cord, so the mice were subjected to the same process several times.’

      • For the novel weigh boat experiments, the authors should explicitly state when these experiments were done in relation to the fox urine, was it a different session or the same session? Were they the same animals? Statements like the following (line 251) imply it was done in the same animals in the same session but it should be clarified in the methods "We also showed using fiber photometry that the novelty of the introduction of a foreign object to the cage, before adding fox odor, was sufficient to activate enkephalinergic neurons."

      As shown in supplementary figure 4, individual animal data is shown for both water and fox urine exposure (overlaid) to depict whether there were differences in their responses to each manipulation – in the same animal. And yes, you are correct, the animals were first exposed to water 3 times in the recording session and then exposed to fox urine 3 times in the same session. We have added that to the methods section describing in vivo fiber photometry. See lines 575-576.  

      • Statistical testing would be needed to affirm the conclusions the authors draw from the fox urine and novel weigh boat experiments. For example, it shows stats that the response attenuates, that it is not different between fox urine and novel (it looks like the response is stronger to the fox urine when looking at the individual animals), etc. These data look clear but stats are formally needed. Formal statistics are also missing in other parts of the manuscript where conclusions are drawn from the data but direct statistical comparisons are not included (e.g. Fig 2.G-I).

      The photometry data are shown as z-scores, which is itself a formal statistical treatment. Running an ANOVA on z-scores would be inappropriate; we understand that this is sometimes done in the fiber photometry literature, but it remains incorrect. The z-scores alone provide all the information needed about the deviation from baseline. We understand that this is not immediately clear to readers, and we thank you for allowing us to explain why this is the case. We have added test statistics to the figure legends wherever hypothesis testing was done and p-values were reported.
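      As a concrete illustration of the z-scoring convention described above, the sketch below transforms a trace relative to a pre-stimulus baseline window. The trace values and window bounds are hypothetical, not the study's data.

```python
# Hedged sketch of baseline z-scoring for a photometry trace.
# The trace and baseline window below are invented for illustration.
import statistics

trace = [0.1, 0.2, 0.15, 0.12, 0.9, 1.4, 1.1, 0.5]   # hypothetical dF/F samples
baseline = trace[:4]                                  # pre-stimulus window

mu = statistics.fmean(baseline)
sd = statistics.stdev(baseline)                       # sample SD of the baseline
z = [(v - mu) / sd for v in trace]                    # deviation in baseline SDs
```

      By construction, the baseline portion of `z` has mean zero, so each post-stimulus value directly reads out how many baseline standard deviations the signal has deviated.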

      • Did the authors try to present the animals with repeated fox urine exposure to see if this habituates like the photometry?

      No, we did not do that experiment due to the constrained timing within which we had to run our microdialysis/LC-MS timeline, but it is a great point for future exploration. 

      • It would be useful to present the time course of the odor experiment for the microdialysis experiment.

      The timeline is shown in Fig. 1a and Fig. 3e. To reiterate, each sample is 13 minutes long.

      • Can the authors determine if differences in behavior (e.g. excessive avoidance in animals with with one type of response) or microdialysis probe location dictate whether animals fall into categories of increased release, no release, or no-detection? From the breakdown, it looks like it is almost equally split into three parts but the authors' descriptions of this split are somewhat misleading (line 210). " The response to predator odor varies appreciably: although most animals show increased Met-Enk release after fox odor exposure, some show continued release with no elevation in Met-Enk levels, and a minority show no detectable release".

      Thank you for your constructive feedback. We do not believe the difference in behavior is correlated with probe placement. The placement map can be found in suppl. Fig 3 and shows that all mice included in the manuscript had probes in the NAcSh. We purposely did not distinguish between dorsal and ventral shell because our 1 mm membrane would make it hard to presume exclusive sampling from one subregion. That is a great point, though, and we have thought about it extensively for future studies. We have edited the language to reflect the almost even split of responses for Met-Enk and appreciate you pointing that out.

      • Overall, given the inconsistencies in experimental design and overall caveats associated, I think the authors are unable to draw reasonable conclusions from the repeated stressor experiments and something they should either consider is not trying to draw strong conclusions from these observations or perform additional experiments that provide the grounds to derive those conclusions.

      We have included additional language on the caveats of our study, and our use of a dual approach combining fiber photometry and microdialysis was largely driven by a desire to offer additional support for our conclusions. We expected pushback about our conclusions, so we offered a secondary analysis using a different technique to test our hypothesis. To be honest, the tone and content of this comment are not particularly constructive (especially for trainees), nor do they offer a space to realistically address anything. This work took multiple years to optimize, was led by a graduate student, and required a multidisciplinary team. As highlighted, we believe it offers an important contribution to the literature and pushes the field of peptide detection forward.

      Reviewer #2 (Recommendations For The Authors):

      A more comprehensive and objective interpretation of results could enhance the overall quality of the paper. The manuscript contains statements like "we are the first to confirm," which can be challenging to substantiate and may not significantly enhance the paper. It's essential to ensure that novelty statements are well-founded. For example, the release of enkephalins from other brain regions after stress exposure is well-documented but not addressed in the paper. Similarly, the role of the NA shell in stress has been extensively studied but lacks coverage in this manuscript.

      We have edited the language to reflect your feedback. We have also included relevant literature expanding on the demonstrated roles of enkephalins in the literature. We would like to note that most studies have focused on chronic stress, and we were particularly interested in acute stress. See lines 129-134.

      ‘These studies have included regions such as the locus coeruleus, the ventral medulla, the basolateral nucleus of the amygdala, and the nucleus accumbens core and shell. Studies using global knockout of enkephalins have shown varying responses to chronic stress interventions where male knockout mice showed resistance to chronic mild stress in one study, while another study showed that enkephalin-knockout mice showed delayed termination of corticosteroid release. [33,34]’ 

      Finally, not a weakness but a clarification suggestion: the method description mentions the use of 1% FA in the sample reconstitution solution and LC solvents, which is an unusually high concentration of acid. If this concentration is intentional for maintaining the peptides' oxidation state, it would be beneficial to mention this in the text to assist readers who might want to replicate the method.

      This is correct and has been clarified in the methods section.

      Reviewer #3 (Recommendations For The Authors):

      -The Abstract should state the critical improvements that are made. Also, quantify the improvements in spatiotemporal resolution.

      Thank you for your comment. We have edited the abstract to reflect this. 

      - The use of "amol/sample" as concentration is less informative than an SI units (e.g., pM concentration) and should be changed. Especially since the volume used was the same for in vivo sampling experiments.

      Thank you for your comment. We chose to report amol/sample because we are measuring such small quantities and wanted to account for any slight errors in volume, which can make drastic differences in reported concentrations, especially since samples are dried and resuspended.

      -Please check this sentence: "After each collection, the samples were spiked with 2 µL of 12.5 fM isotopically labeled Met-Enkephalin and Leu-Enkephalin" This dilution would yield a concentration of ~2 fM. In a 12 uL sample, that would be ~0.02 amol, well below the detection limit. (note that fM would femtomolar concentration and fmol would be femtomoles added).

      -"liquid chromatography/mass spectrometry (LC-MS) [9-12]"... Reference 9 is a RIA analysis paper, not LC-MS as stated.

      Thank you for catching these. We have corrected the unit and citation. 
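      The reviewer's spike arithmetic can be checked directly. Assuming a 2 µL spike at 12.5 fM into a final sample volume of about 12 µL, as stated in the comment above:

```python
# Check of the spike arithmetic in the reviewer's comment (volumes in litres,
# concentration in mol/L). A 12.5 fM spike of 2 uL contributes ~0.025 amol,
# diluted to ~2.1 fM in a 12 uL sample, hence fmol (not fM) was intended.
spike_vol = 2e-6          # 2 uL spike volume
spike_conc = 12.5e-15     # 12.5 fM (femtomolar)
sample_vol = 12e-6        # ~12 uL final sample volume

moles = spike_vol * spike_conc            # mol of labeled peptide added
amol = moles / 1e-18                      # convert to attomoles
final_fM = moles / sample_vol / 1e-15     # final femtomolar concentration
```

      This confirms the reviewer's point: at femtomolar spike concentrations the added standard would be far below the detection limit, so the intended unit must have been fmol.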

      -Given that improvements in temporal resolution are claimed, the lack of time course data with a time axis is surprising. Rather, data for baseline and during treatment appear to be combined in different plots. Time course plots of individuals and group averages would be informative.

      Due to the expected variability in individual animal time courses (for example, detectable levels in one sample followed by no detection in the next), it was very difficult to combine data across time. Therefore, to maximize data inclusion from all animals that showed baseline measurements and responses to individual manipulations, we opted to report snapshot data. Our improvement in temporal resolution refers to the duration of each sample rather than to continuous sampling, so the two are unrelated. Thank you for your feedback and for allowing us to clarify this.

      - I do not understand this claim "We use custom-made microdialysis probes, intentionally modified so they are similar in size to commonly used fiber photometry probes to avoid extensive tissue damage caused by traditional microdialysis probes (Fig. 1B)." The probes used are 320 um OD and 1 mm long. This is not an uncommon size of microdialysis probes and indeed many are smaller, so is their probe really causing less damage than traditional probes?

      Thank you for your comment. We are only trying to make the point that the tissue damage from these probes is comparable to that of commonly used fiber photometry probes. We point this out because tissue damage is sometimes cited in the literature to dissuade the use of microdialysis, and we wanted to address that concern directly. We have clarified the statement you pointed out.

      -The oxidation procedure is a good idea, as mentioned above. It would be interesting to compare met-enk with and without the oxidation procedure to see how much it affects the result (I would not say this is necessary though). It is not uncommon to add antioxidants to avoid losses like this. Also, it should be acknowledged that the treatment does prevent the detection of any in vivo oxidation, perhaps that is important in met-enk metabolism?

      The comparison between oxidized and unoxidized Met-Enk detection is shown in Fig. 1C.

      -It would be a best practice to report the standard deviation of signal for technical replicates (say near in vivo concentrations) of standards and repeated analysis of a dialysate sample to be able to understand the variability associated with this method. Similarly, an averaged basal concentration from all rats.

      Thank you for your comment. We have included a table showing example quality control standard injections from 4 randomly selected experiments included in the manuscript that were run before and after each experiment and descriptive statistics associated with these technical replicates. We also added some detail to the methods section to describe how quality control is done. See lines 521-530. 

      ‘Each sample was run in two technical replicates and the peak area ratio was averaged before concentration calculations of the peptides were conducted. Several quality control steps were conducted prior to running the in vivo samples. 1) Two technical replicates of a known concentration were injected and analyzed – an example table from 4 random experiments included in this manuscript is shown below. 2) The buffers used on the day of the experiment (aCSF and high K+ buffer) were also tested for any contaminating Met-Enk or Leu-Enk signals by injecting two technical replicates for each buffer. Once these two criteria were met, the experiment was analyzed through the system. If either step failed, which happened a few times, the samples were frozen and the machines were cleaned and restarted until the quality control measures were met.’
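      A sketch of the replicate averaging and QC summary statistics described above is given below. The peak-area ratios are invented for illustration; the real pipeline's values and replicate counts will differ.

```python
# Hedged sketch: average the peak-area ratio of two technical replicates per
# sample, then summarize the QC standards with mean, SD, and %CV.
# All ratio values below are invented.
import statistics

replicates = [(0.82, 0.86), (0.79, 0.81), (0.84, 0.84)]   # (rep1, rep2) ratios
averaged = [(a + b) / 2 for a, b in replicates]           # per-sample average

mean = statistics.fmean(averaged)
sd = statistics.stdev(averaged)         # sample SD across QC injections
cv_percent = 100 * sd / mean            # coefficient of variation, in percent
```

      Reporting the %CV of repeated standard injections in this way is one conventional summary of the method's technical variability.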

      EDITORS NOTE

      Should you choose to revise your manuscript, please include full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.

      Thank you for your suggestion. We have included more detail about statistical analysis in the figure legends per this comment and reviewer comments.

    1. Securing the physical surroundings of your gadgets is as important as implementing digital security protocols. Devices such as cameras are gateways that house confidential information and are susceptible to unauthorized access or tampering. Approaches: use protective casings that are resistant to tampering; add security features to enclosures to deter unauthorized access or tampering with your devices; and place devices in hard-to-reach spots to lower the chances of tampering or access-control breaches. For example, mount surveillance cameras up high and keep equipment entrances inside areas away from windows and doors.

      Listing multiple security strategies, and noting that the article intentionally provides several solutions rather than a single quick fix, aligns with my opinion that a multi-step security approach is required.

    1. Reviewer #1 (Public review):

      In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting the presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.

      Major points:

      (1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.

      (2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.

      (3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.

      (4) A previous investigation on the role of GPR55 in the control of neurotransmitter release is not cited nor discussed Sylantyev et al., (2013, PNAS, Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses). Similarities and differences should be discussed.

      Minor point:

      (1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting the presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.

      We thank the reviewer for the encouraging comments, and will fully address the reviewer’s concerns as detailed below.

      Major points:

      (1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.

      We agree with the reviewer’s concern that the present manuscript lacks evidence for the localization of GPR55 at PC axon terminals. Honestly, our previous attempts to immunolabel GPR55 did not work well. We now realize that different antibodies are commercially available, and we are going to test them. Hopefully, the revised manuscript will include immunocytochemical images showing GPR55 at PC terminals.

      (2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.

      We thank the reviewer for pointing out this essential issue regarding the specificity of GPR55 activation in our study. Regarding direct manipulation of GPR55, such as genetic deletion, we will try acute knock-down of its expression, considering the possibility of compensation, which sometimes occurs when a complete knock-out is performed. In addition, following the reviewer’s suggestion, we will examine whether the effects of LPI and AM251 occlude each other, and we will also perform control experiments showing the lack of CB1R involvement.

      (3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.

      As suggested, the timing and duration of drug application will be indicated together with the time course of changes of IPSC amplitudes. This change will make things much clearer. Thank you for the suggestion.

      (4) A previous investigation on the role of GPR55 in the control of neurotransmitter release is not cited nor discussed Sylantyev et al., (2013, PNAS, Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses). Similarities and differences should be discussed.

      We apologize for missing this important study in the discussion and citations. In the revised version we will, of course, cite it and discuss its findings in relation to our data.

      Minor point:

      (1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.

      We apologize for the insufficient explanation of the LPI used in our study. We used LPI derived from soy (Merck, catalog #L7635), estimated to contain 58% C16:0 and 42% C18:0 or C18:2 LPI. This information will be added to the Materials and Methods in the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      This paper investigates the mode of action of GPR55, a relatively understudied type of cannabinoid receptor, in presynaptic terminals of Purkinje cells. The authors use demanding techniques of patch clamp recording of the terminals, sometimes coupled with another recording of the postsynaptic cell. They find a lower release probability of synaptic vesicles after activation of GPR55 receptors, while presynaptic voltage-dependent calcium currents are unaffected. They propose that the size of a specific pool of synaptic vesicles supplying release sites is decreased upon activation of GPR55 receptors.

      Strengths:

      The paper uses cutting-edge techniques to shed light on a little-studied, potentially important type of cannabinoid receptor. The results are clearly presented, and the conclusions are for the most part sound.

      We are really happy to hear the encouraging comments from the reviewer.

      Weaknesses:

      The nature of the vesicular pool that is modified following activation of GPR55 is not definitively characterized.

      During revision, we will perform further analyses and additional experiments to characterize the vesicle pools affected by GPR55 as thoroughly as possible.

      Reviewer #3 (Public review):

      Summary:

      Inoshita and Kawaguchi investigated the effects of GPR55 activation on synaptic transmission in vitro. To address this question, they performed direct patch-clamp recordings from axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis utilizing synapto-pHluorin. They found that exogenous activation of GPR55 suppresses GABA release at Purkinje cell to deep cerebellar nuclei (PC-DCN) synapses by reducing the readily releasable pool (RRP) of vesicles. This mechanism may also operate at other synapses.

      Strengths:

      The main strength of this study lies in combining patch-clamp recordings from axon terminals with imaging of presynaptic vesicular exocytosis to reveal a novel mechanism by which activation of GPR55 suppresses inhibitory synaptic strength. The results strongly suggest that GPR55 activation reduces the RRP size without altering presynaptic calcium influx.

      We thank the reviewer for the positive evaluation on our conclusions.

      Weaknesses:

      The study relies on the exogenous application of GPR55 agonists. It remains unclear whether endogenous ligands released due to physiological or pathological activities would have similar effects. There is no information regarding the time course of the agonist-induced suppression. There is also little evidence that GPR55 is expressed in Purkinje cells. This study would benefit from using GPR55 knockout (KO) mice. The downstream mechanism by which GPR55 mediates the suppression of GABA release remains unknown.

      We agree with the reviewer on all of the points raised as weaknesses. Most of them will be made much clearer by the additional experiments and analyses described above in response to the other reviewers. The question of whether endogenous GPR55 ligands cause this synaptic depression, and its downstream mechanism, are very important issues; we will discuss these points in the revised manuscript and would like to address them in future studies.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The paper proposes that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' can significantly impact the validity of neural measures of consciousness. The authors found that conservative criteria, which require stronger evidence to classify a stimulus as 'seen,' tend to inflate effect sizes in neural measures, making conscious processing appear more pronounced than it is. Conversely, liberal criteria, which require less evidence, reduce these effect sizes, potentially underestimating conscious processing. This variability in effect sizes due to criterion placement can lead to misleading conclusions about the nature of conscious and unconscious processing.

      Furthermore, the study highlights that the Perceptual Awareness Scale (PAS), a commonly used tool in consciousness research, does not effectively mitigate these criterion-related confounds. This means that even with PAS, the validity of neural measures can still be compromised by how criteria are set. The authors emphasize the need for careful consideration and standardization of criterion placement in experimental designs to ensure that neural measures accurately reflect the underlying cognitive processes. By addressing this issue, the paper aims to improve the reliability and validity of findings in the field of consciousness research.

      Strengths:

      (1) This research provides a fresh perspective on how criterion placement can significantly impact the validity of neural measures in consciousness research.

      (2) The study employs robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence.

      (3) By highlighting the limitations of the PAS and the impact of criterion placement, the study offers practical recommendations for improving experimental designs in consciousness research.

      Weaknesses:

      The PAS, the primary criterion measure examined, is a commonly used tool, but there are other measures of consciousness that were not evaluated, which might also be subject to similar or different criterion limitations. A simulation could be applied to these metrics to show how generalizable the conclusion of the study is.

      We would like to thank reviewer 1 for their positive words and for taking the time to evaluate our manuscript. We agree that it would be important to gauge generalization to other metrics of consciousness. Note, however, that the most commonly used alternative methods are post-decision wagering and confidence, both of which are known to behave quite similarly to the PAS (Sandberg, Timmermans, Overgaard & Cleeremans, 2010). Indeed, we have confirmed in other work that confidence is also sensitive to criterion shifts (see https://osf.io/preprints/psyarxiv/xa4fj). Although it has been claimed that confidence-derived aggregate metrics like meta-d’ or metacognitive efficiency may overcome criterion shifts, it would require empirical data rather than simulation to settle whether this is true (also see the discussion in https://osf.io/preprints/psyarxiv/xa4fj). Furthermore, of these metrics, the PAS seems to be the preferred one among consciousness researchers (see figure 4 in Francken, Beerendonk, Molenaar, Fahrenfort, Kiverstein, Seth & van Gaal, 2022; as well as https://osf.io/preprints/psyarxiv/bkxzh). Thus, given that other metrics are either expected to behave in similar ways and/or would require more empirical work to determine along which dimension(s) criterion shifts operate, we see no clear path to implement the suggested simulations. We anticipate that doing so would require a considerable amount of additional work, figuring out many things that we believe would better suit a future project. We would of course be open to this if the reviewer had more specific suggestions for how to go about the proposed simulations.

      Reviewer #2 (Public review):

      Summary:

      The study investigates the potential influence of the response criterion on neural decoding accuracy in consciousness and unconsciousness, utilizing either simulated data or reanalyzing experimental data with post-hoc sorting data.

      Strengths:

      When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated the subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.

      Weaknesses:

      (1) The response criterion plays a crucial role in influencing neural decoding because a subject's report may not always align with the actual stimulus presented. This discrepancy can occur in cases of false alarms, where a subject reports seeing a target that was not actually there, or in cases where a target is present but not reported. Some may argue that only using data from consistent trials (those with correct responses) would not be affected by the response criterion. However, the authors' analysis suggests that a conservative response criterion not only reduces false alarms but also impacts hit rates. It is important for the authors to further investigate how the response criterion affects neural decoding even when considering only correct trials.

      We would like to thank reviewer 2 for taking the time to evaluate our manuscript. We appreciate the suggestion to investigate neural decoding on only correct trials. What we in fact did is consider target trials that are 'correct' (hits = seen target present trials) and 'incorrect' (misses = unseen target present trials) separately, see Figures 4A and 4B. This shows that the response criterion also affects the neural measure of consciousness when only considering correct target present trials. Note however, that one cannot decode 'unseen' (target present) trials if one only aims to decode 'correct' trials, because those are all incorrect by definition. We did not analyze false alarms (these would be the 'seen' trials on the noise distribution of Figure 1A), as there were too few of those trials, especially in the conservative condition (see Figure 2C and 2D), making comparisons between conservative and liberal impossible. However, the predictions for false alarms are fairly straightforward, and follow directly from the framework in Figure 1.

      (2) The author has utilized decoding target vs. nontarget as the neural measures of unconscious and/or conscious processing. However, it is important to note that this is just one of the many neural measures used in the field. There are an increasing number of studies that focus on decoding the conscious content, such as target location or target category. If the author were to include results on decoding target orientation and how it may be influenced by response criterion, the field would greatly benefit from this paper.

      We thank the reviewer for the suggestion to decode orientation of the target. In our experiments, the target itself does not have an orientation, but the texture of which it is composed does. We used four orientations, which were balanced out within and across conditions such that presence-absence decoding is never driven by orientation, but rather by texture based figure-ground segregation (for similar logic, see for example Fahrenfort et al, 2007; 2008 etc). There are a couple of things to consider when wanting to apply a decoding analysis on the orientation of these textures:

      (1) Our behavioral task was only on the presence or absence of the target, not on the orientation of the textures. This makes it impossible to draw any conclusions about the visibility of the orientation of the textures. Put differently: based on behavior there is no way of identifying seen or unseen orientations, correctly or incorrectly identified orientations, and so on. For example, it is easy to envision that an observer detects a target without knowing the orientation that defines it, or, vice versa, a situation in which an observer does not detect the target while still being aware of the orientation of a texture in the image (either of the figure, or of the background). The fact that we have no behavioral response to the orientation of the textures severely limits the usefulness of a hypothetical decoding effect on these orientations, as such results would be uninterpretable with respect to the relevant dimension in this experiment, which is visibility.

      (2) This problem is further exacerbated by the fact that the orientation of the background is always orthogonal to the orientation of the target. Therefore, one would not only be decoding the orientation of the texture that constitutes the target itself, but also the texture that constitutes the background. Given that we also have no behavioral metric of how/whether the orientation of the background is perceived, it is similarly unclear how one would interpret any observed effect.

      (3) Finally, it is important to note that, even though categorization/content is sometimes used as an auxiliary measure in consciousness research (often as a way to assay objective performance), consciousness is most commonly conceptualized on the presence-absence dimension. A clear illustration of this is the concept of blindsight. Blindsight is the ability of observers to discriminate stimuli (i.e. identify content) without being able to detect them. Blindsight is often considered the bedrock of the cognitive neuroscience of consciousness, as it acts as proof that one can dissociate between unconscious processing (the categorization of a stimulus, i.e. the content) and conscious processing of that stimulus (i.e. the ability to detect it).

      Given the above, we do not see how the suggested analysis could contribute to the conclusions that the manuscript already establishes. We hope that, given the above, the reviewer agrees with this assessment.

      Reviewer #3 (Public review):

      Summary:

      Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participants' reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.

      Strengths and Weaknesses:

      One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.

      A potential area for improvement in this study is the use of single time-points from peak decoding accuracy to generate current source density topography maps. While we recognize that the decoding analysis employed here differs from traditional ERP approaches, the robustness of the findings could be enhanced by exploring current source density over relevant time windows. Event-related peaks, both in terms of timing and amplitude, can sometimes be influenced by noise or variability in trial-averaged EEG data, and a time-window analysis might provide a more comprehensive and stable representation of the underlying neural dynamics.

      We thank reviewer 3 for their positive words and for taking the time to evaluate our manuscript. If we understand the reviewer correctly, he/she suggests that the signal-to-noise ratio could be improved by averaging over time windows rather than taking the values at singular peaks in time. Before addressing this suggestion, we would like to point out that we plotted the relevant effects across time in Supplementary Figure S1A and S1B. These show that the observed effects were not somehow limited in time, i.e. only occurring around the peaks, but that they consistently occurred throughout the time course of the trial. In line with this observation one might argue that the results could be improved further by averaging across windows of interest rather than taking the peak moments alone, as the reviewer suggests. Although this might be true, there are many analysis choices that one can make, each of which could have a positive (or negative) effect on the signal-to-noise ratio. For example, when taking a window of interest, one is faced with a new choice to make, this time regarding the number of consecutive samples to average across (i.e. the size of the window), etc. More generally there is a long list of choices that may affect the precise outcome of analyses, either positively or negatively. Having analyzed the data in one way, the problem with adding new analysis approaches is that there is no objective criterion for deciding which analysis would be ‘best’, other than looking at the outcome of the statistical analyses themselves. Doing this would constitute an explorative double-dipping-like approach to analyzing the results, which – aside from potentially increasing the signal-to-noise ratio – is likely to also result in an increase of the type I error rate.
In the past, when the first author of this manuscript has attempted to minimize the number of statistical tests, he has lowered the number of EEG time points by simply taking the peaks (for example see https://doi.org/10.1073/pnas.1617268114), and that is the approach that was taken here as well. Given the above, we prefer not to further ‘try out’ additional analytical approaches on this dataset, simply to improve the results. We hope the reviewer sympathizes with our position that it is methodologically most sound to stick to the analyses we have already performed and reported, without further exploration.

      It is helpful that the authors show the standard error of the mean for the classifier performance over time. A similar indication of a measure of variance in other figures could improve clarity and transparency.

      That said, the paper appears solid regarding technical issues overall. The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.

      We thank the reviewer for this suggestion. Note that we already have a measure of variance in the other figures too, namely showing the connected data points of individual participants. Individual data points as a visualization of variance is preferred by many journals (e.g., see https://www.nature.com/documents/cr-gta.pdf), and also shows the spread of relevant differences when paired points are connected. For example, in Figures 2, 3 and 4, the relevant difference is between the liberal and conservative condition. When wanting to show the spread of the differences between these conditions, one option would be to first subtract the two measures in a pairwise fashion (e.g., liberal-conservative), and then plot the spread of those differences using some metric (e.g. standard error/CI of the mean difference). However, this has the disadvantage of no longer separately showing the raw scores on the conditions that are being compared. Showing conditions separately provides clarity to the reader about what is being compared to what. The most common approach to visualizing the variance of the relevant difference in such cases is to plot the connected individual data points of all participants in the same plot. The uniformity of the slope of these lines in such a visualization provides direct insight into the spread of the relevant difference. Plotting the standard error of the mean on the raw scores of the conditions in these plots would not help, because this would not visualize the spread of the relevant difference (liberal-conservative). We therefore opted in the manuscript to show the mean scores on the conditions that we compare, while also showing the connected raw data points of individual participants in the same plot. One might argue that we should then use that same visualization in Figure 3A, but note that this figure is merely intended to identify the peaks, i.e. it does not compare liberal to conservative.
Furthermore, plotting the decoding time lines of individual participants would greatly diminish the clarity of this figure. Given our explanation, we hope the reviewer agrees with the approach that we chose, although we are of course open to modifying the figures if the reviewer has a suggestion for doing so while taking into account the points we raise here in our response.

      Impact of the Work:

      This study effectively demonstrates a phenomenon that has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors could further elaborate on the results of the PAS to provide a clearer insight into the impact of response criteria, which is notably more complex than in other experiments. Specifically, the results demonstrate that the conservative response criterion condition displays considerably higher sensitivity compared to the liberal response criterion condition. It would be interesting to explore whether this shift in sensitivity suggests a correlation between changes in response criteria and conscious experiences, and how the interaction between sensitivity and response criteria can affect the neural measure of consciousness.

      We thank the reviewer for this suggestion. Note that the change in sensitivity that we observed is minor compared to the change we observed in response criterion (hedges g criterion in exp 2 = 2.02, compared to hedges g sensitivity/d’ in exp 2 = 0.42). However, we do investigate the effect of sensitivity (disregarding response criterion) on decoding accuracy. To this end we devised Figure 3C (for the full decoding time course see Supplementary Figure S1B). These figures show that the small behavioral sensitivity effects observed in both experiments (hedges g sensitivity in exp 1 = 0.30, exp 2 = 0.42) did not translate into significant decoding differences between conservative and liberal in either experiment. This comes as no surprise given the small corresponding behavioral effects. Note that small sensitivity differences between liberal and conservative conditions are commonplace, plausibly driven by the fact that being liberal also involves being more noisy in one’s response tendencies (i.e. sometimes randomly indicating presence). Further, the reviewer suggests that we might correlate changes in response criteria to changes in conscious experience. The only relevant metric of conscious experience for which we have data in this manuscript is the Perceptual Awareness Scale (PAS), so we assume the reviewer asks for a correlation between experimentally induced changes in response criterion with the equivalent changes in d’. To this end we computed the difference in the PAS-based d’ metric between conservative and liberal, as well as the difference in the PAS-based criterion metric between conservative and liberal, and correlated these across subjects (N=26) using a Spearman rank correlation. The result shows that these metrics do not correlate r(24)=0.04, p=0.85. Note however that small-N correlations like these are only somewhat reliable for large effect sizes. 
      With an N of 26, even 80% power requires an effect size of at least r = 0.5 to be detectable, so even if a correlation were to exist we may not have had enough power to detect it. Due to these caveats we opted not to report this null correlation in the manuscript, but we are of course willing to do so if the reviewer and/or editor disagrees with this assessment.
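The power figure quoted above can be checked with a rough Monte Carlo sketch (our own back-of-the-envelope illustration, not an analysis from the manuscript): simulate a true population correlation of r = 0.5 at N = 26, apply a Spearman test with a conventional two-tailed .05 criterion, and count how often it comes out significant.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_sims, rho = 26, 5000, 0.5

# Two-tailed alpha = .05 critical value for a correlation with df = n - 2,
# via the usual t-to-r conversion (t(24) critical value is about 2.064).
t_crit = 2.064
r_crit = t_crit / np.sqrt(t_crit**2 + n - 2)

def spearman(x, y):
    # Rank both variables, then take the Pearson correlation of the ranks.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

hits = 0
for _ in range(n_sims):
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    if abs(spearman(x, y)) > r_crit:
        hits += 1
power = hits / n_sims
print(power)  # stays below the 80% target at this sample size
```

With these settings the estimated power lands below 80%, consistent with the point that only fairly large correlations are reliably detectable at this sample size.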

  6. Jan 2025
    1. Reviewer #3 (Public review):

      Summary

      The manuscript investigates the role of norepinephrine (NE) release in the rodent hippocampus during event boundaries, such as transitions between spatial contexts and the introduction of novel objects. It also explores how NE release is altered by experience and how novelty drives the amplitude and decay times of extracellular NE. By utilizing the GRABNE sensor for sub-second resolution measurement of NE, the authors demonstrate that NE release is driven primarily by the time elapsed since an event boundary and is independent of behaviors like movement or reward. The study further explores how hippocampal neural representations are altered over time, showing that these representations stabilize shortly after event transitions, potentially linking NE release to episodic memory encoding.

      Strengths

      Overall, the work provides novel insights into the interplay between NE signaling and hippocampal activity and presents an intriguing hypothesis on how NE release may help push hippocampal activity into unique attractor states to encode novel experiences. The experiments are well-controlled, and the analysis is well-presented, with a detailed and engaging discussion that points towards several new and exciting research directions. The use of several behavioral paradigms to demonstrate the strongest predictor of NE release is a strength, as well as the regression analysis to disambiguate the contribution of other correlated variables. The suggestion that NE does not select ensembles for subsequent replay is also an interesting result.

      Weaknesses

      The authors have not convincingly established a link between hippocampal neural activity and NE release, showing qualitative rather than quantitative correlations. Therefore, at this stage, the role of NE on hippocampal function remains speculative.

      Another general concern is that the smoothing/kinetics of the sensor impacts the regression analyses. Most of the other variables, such as speed, acceleration, and even reward time points, are highly dynamic, and it is possible that the limitations of the sensor decorrelate the signal from (potentially) causal variables, therefore resulting in the time since the event start having the most explanatory power for most of the analyses.

      More broadly, the figure legends should be expanded to better describe error bounds, mean vs median, sample sizes, and averaging choices for plots.

      There are also some concerns regarding the nearest neighbor analysis and the reported differences in the rate of reactivations after familiar and novel environments, as outlined below.

      (1) Lines 657-658. How far away in time can the top three nearest neighbor time points be? Must they lie in different trials, or can they also be within the same trial? Is there a systematic difference in the average time lags for the nearest neighbors over the course of the session?

      The authors should only allow nearest neighbors to be in a different lap because systematic changes in behavior (running fast initially) might force earlier time bins in a certain location to match with a different trial, while the later time bins can be from within the same trial if the mice are moving slower and stay in the same spatial bin location longer. The authors should also provide information on how the averaging is performed because there are several axes of variability - spatial bin locations, sessions, different environments, and animals.

      (2) Figure 8: These results are very interesting. However, I am confused by the differences between Figures 8B and D, because the significant reactivations in A and C are very similar. The 1-minute and 10-minute windows seem somewhat arbitrary and prone to noise and variability. Perhaps the authors should fit a slope for the curves in A and C and compare whether the slope/intercept are significantly different between the novel and familiar environments.

    1. In this guide, you will find comprehensive information on:

       - Brightspace Date Terminology: Understand key terms such as due dates, availability, start dates, end dates, special access, and visibility settings.
       - Visibility Settings for Discussions: Learn how to manage visibility to keep conversations organized and accessible.
       - Visibility Settings for Assignments: Understand how to control student access to assignment submissions.
       - Due Dates and Availability Dates for Assignments and Quizzes: Discover best practices for setting deadlines and availability windows to enhance time management and accountability.
       - Answers to Frequently Asked Questions: Find quick solutions to common issues related to dates and availability in Brightspace.
       - Tips for Applying Dates in Brightspace: Get practical advice on effectively using dates to structure your course and improve communication.

      Accordion ToC?

    1. 2.10 Operating-System Debugging

      This section explores operating-system debugging, covering failure analysis, performance monitoring, and advanced tracing tools. Debugging involves identifying and fixing errors in software and hardware, with performance tuning aiming to eliminate processing bottlenecks. When a process fails, operating systems log errors and may generate core dumps for analysis, while kernel failures result in crash dumps. Debugging kernel issues is complex due to hardware control and limited debugging tools. Performance monitoring relies on counters and tracing methods. Linux provides tools like ps, top, vmstat, and /proc for tracking resource usage, while Windows uses Task Manager. Tracing tools, such as strace, gdb, and tcpdump, capture event-based data for in-depth analysis. The BCC toolkit, built on eBPF, enables secure and low-impact debugging of live systems by tracing interactions between user and kernel code. BCC tools, such as disksnoop for disk I/O and opensnoop for system calls, provide real-time insights into system performance and security without disrupting critical applications.
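The event-based tracing idea behind tools like strace and the BCC collection can be illustrated in miniature inside a single process. The sketch below is our own illustration (none of these names come from strace or BCC): it uses Python's sys.settrace hook to record every function-call event during a workload, much as strace records every system call a process makes.

```python
import sys

def make_tracer(counts):
    # Record every Python function-call event, similar in spirit to how
    # strace records each system call a process makes.
    def tracer(frame, event, arg):
        if event == "call":
            name = frame.f_code.co_name
            counts[name] = counts.get(name, 0) + 1
        return None  # no per-line tracing needed
    return tracer

def helper(i):
    return i * i

def workload():
    return sum(helper(i) for i in range(3))

counts = {}
sys.settrace(make_tracer(counts))   # attach the tracer...
result = workload()
sys.settrace(None)                  # ...and detach it again

print(result)            # 0 + 1 + 4 = 5
print(counts["helper"])  # helper() was entered 3 times
```

Real tracers such as strace and opensnoop hook the kernel's system-call boundary rather than the interpreter, but the principle is the same: attach a callback to an event stream and aggregate what it sees.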

    2. 2.8 Operating-System Structure A system as large and complex as a modern operatin

      Operating system structure is crucial for managing complexity and ensuring functionality. Modern operating systems are typically organized into modular components, each with well-defined interfaces, similar to how programs are divided into functions. The monolithic structure, used in UNIX and Linux, combines all kernel functionalities into a single address space, offering high performance due to minimal overhead but making the system difficult to extend and maintain. In contrast, the layered approach divides the OS into distinct layers, each relying on the services of lower layers, simplifying debugging and verification but often suffering from performance overhead due to inter-layer communication. The microkernel approach, exemplified by Mach (used in macOS and iOS), minimizes the kernel by moving nonessential services to user space, communicating via message passing. This enhances security, reliability, and portability but incurs performance penalties due to message-passing overhead. For example, Windows NT initially used a microkernel but shifted to a more monolithic design to improve performance.

      A modern compromise is the use of loadable kernel modules (LKMs), as seen in Linux, macOS, and Windows. This approach combines the efficiency of monolithic kernels with the flexibility of modular design. Core services remain in the kernel, while additional functionalities, like device drivers or file systems, are dynamically loaded at runtime. This allows for easier updates and customization without recompiling the entire kernel, maintaining performance while enhancing modularity. Overall, the choice of structure depends on balancing performance, flexibility, and ease of maintenance.

      Hybrid operating systems combine different structures to balance performance, security, and usability. Linux, primarily monolithic, is modular for dynamic functionality. Windows shares monolithic traits but supports microkernel features like user-mode subsystems. Apple's macOS and iOS share the Darwin kernel, featuring Mach and BSD UNIX. macOS targets desktops with Intel chips, while iOS is optimized for mobile ARM architectures with strict security controls. Both provide Cocoa frameworks for application development. Darwin’s hybrid nature integrates Mach’s memory management and IPC with BSD’s POSIX functions. Despite Darwin’s open-source status, Apple’s proprietary frameworks, such as Cocoa, remain closed to developers.

      Android is an open-source mobile OS developed by Google, supporting various hardware platforms. It features a layered architecture, including the Linux kernel, Bionic C library, and Android Runtime (ART), which performs ahead-of-time (AOT) compilation for efficiency. Developers use Java with Android’s custom API, accessing hardware via the hardware abstraction layer (HAL).
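The contrast between direct invocation (monolithic) and message passing (microkernel) can be sketched with a toy router. This is purely an illustration under invented names (MicroKernel, file_service), not any real kernel interface:

```python
from queue import Queue

def file_service(request):
    # A "user-space" service: the kernel never sees its internals.
    op, payload = request
    if op == "read":
        return "contents of " + payload
    return "unknown op"

class MicroKernel:
    # Toy kernel that only routes messages to registered services.
    def __init__(self):
        self.services = {}

    def register(self, name, handler):
        self.services[name] = handler

    def send(self, service, request):
        # Message passing: the request is marshalled through a mailbox,
        # standing in for a copy across an address-space boundary. This
        # indirection is the source of the overhead the text mentions.
        mailbox = Queue()
        mailbox.put(request)
        return self.services[service](mailbox.get())

kernel = MicroKernel()
kernel.register("fs", file_service)
reply = kernel.send("fs", ("read", "/etc/hosts"))
print(reply)  # contents of /etc/hosts
```

A monolithic design would call file_service directly, which is faster but couples every service into one address space; the mailbox indirection here is what buys the isolation a microkernel advertises.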

    3. 2.7 Operating-System Design and Implementation In this section, we discuss problems we face in designing and implementing an operating system. There are, of course, no complete solutions to such problems, but there are approaches that have proved successful. 2.7.1 Design Goals The first problem in designing a system is to define goals and specifications. At the highest level, the design of the system will be affected by the choice of hardware and the type of system: traditional desktop/laptop, mobile, distributed, or real time. Beyond this highest design level, the requirements may be much harder to specify. The requirements can, however, be divided into two basic groups: user goals and system goals. Users want certain obvious properties in a system. The system should be convenient to use, easy to learn and to use, reliable, safe, and fast. Of course, these specifications are not particularly useful in the system design, since there is no general agreement on how to achieve them. A similar set of requirements can be defined by the developers who must design, create, maintain, and operate the system. The system should be easy to design, implement, and maintain; and it should be flexible, reliable, error free, and efficient. Again, these requirements are vague and may be interpreted in various ways. There is, in short, no unique solution to the problem of defining the requirements for an operating system. The wide range of systems in existence shows that different requirements can result in a large variety of solutions for different environments. For example, the requirements for Wind River VxWorks, a real-time operating system for embedded systems, must have been substantially different from those for Windows Server, a large multiaccess operating system designed for enterprise applications. Specifying and designing an operating system is a highly creative task. 
Although no textbook can tell you how to do it, general principles have been developed in the field of software engineering, and we turn now to a discussion of some of these principles. 2.7.2 Mechanisms and Policies One important principle is the separation of policy from mechanism. Mechanisms determine how to do something; policies determine what will be done. For example, the timer construct (see Section 1.4.3) is a mechanism for ensuring CPU protection, but deciding how long the timer is to be set for a particular user is a policy decision. The separation of policy and mechanism is important for flexibility. Policies are likely to change across places or over time. In the worst case, each change in policy would require a change in the underlying mechanism. A general mechanism flexible enough to work across a range of policies is preferable. A change in policy would then require redefinition of only certain parameters of the system. For instance, consider a mechanism for giving priority to certain types of programs over others. If the mechanism is properly separated from policy, it can be used either to support a policy decision that I/O-intensive programs should have priority over CPU-intensive ones or to support the opposite policy. Microkernel-based operating systems (discussed in Section 2.8.3) take the separation of mechanism and policy to one extreme by implementing a basic set of primitive building blocks. These blocks are almost policy free, allowing more advanced mechanisms and policies to be added via user-created kernel modules or user programs themselves. In contrast, consider Windows, an enormously popular commercial operating system available for over three decades. Microsoft has closely encoded both mechanism and policy into the system to enforce a global look and feel across all devices that run the Windows operating system. All applications have similar interfaces, because the interface itself is built into the kernel and system libraries. 
Apple has adopted a similar strategy with its macOS and iOS operating systems. We can make a similar comparison between commercial and open-source operating systems. For instance, contrast Windows, discussed above, with Linux, an open-source operating system that runs on a wide range of computing devices and has been available for over 25 years. The “standard” Linux kernel has a specific CPU scheduling algorithm (covered in Section 5.7.1), which is a mechanism that supports a certain policy. However, anyone is free to modify or replace the scheduler to support a different policy. Policy decisions are important for all resource allocation. Whenever it is necessary to decide whether or not to allocate a resource, a policy decision must be made. Whenever the question is how rather than what, it is a mechanism that must be determined. 2.7.3 Implementation Once an operating system is designed, it must be implemented. Because operating systems are collections of many programs, written by many people over a long period of time, it is difficult to make general statements about how they are implemented. Early operating systems were written in assembly language. Now, most are written in higher-level languages such as C or C++, with small amounts of the system written in assembly language. In fact, more than one higher-level language is often used. The lowest levels of the kernel might be written in assembly language and C. Higher-level routines might be written in C and C++, and system libraries might be written in C++ or even higher-level languages. Android provides a nice example: its kernel is written mostly in C with some assembly language. Most Android system libraries are written in C or C++, and its application frameworks—which provide the developer interface to the system—are written mostly in Java. We cover Android's architecture in more detail in Section 2.8.5.2. 
The advantages of using a higher-level language, or at least a systems-implementation language, for implementing operating systems are the same as those gained when the language is used for application programs: the code can be written faster, is more compact, and is easier to understand and debug. In addition, improvements in compiler technology will improve the generated code for the entire operating system by simple recompilation. Finally, an operating system is far easier to port to other hardware if it is written in a higher-level language. This is particularly important for operating systems that are intended to run on several different hardware systems, such as small embedded devices, Intel x86 systems, and ARM chips running on phones and tablets. The only possible disadvantages of implementing an operating system in a higher-level language are reduced speed and increased storage requirements. This, however, is not a major issue in today's systems. Although an expert assembly-language programmer can produce efficient small routines, for large programs a modern compiler can perform complex analysis and apply sophisticated optimizations that produce excellent code. Modern processors have deep pipelining and multiple functional units that can handle the details of complex dependencies much more easily than can the human mind. As is true in other systems, major performance improvements in operating systems are more likely to be the result of better data structures and algorithms than of excellent assembly-language code. In addition, although operating systems are large, only a small amount of the code is critical to high performance; the interrupt handlers, I/O manager, memory manager, and CPU scheduler are probably the most critical routines. After the system is written and is working correctly, bottlenecks can be identified and can be refactored to operate more efficiently.

      Operating system design and implementation involve defining clear goals and balancing user and system requirements. User goals focus on convenience, reliability, and speed, while system goals emphasize ease of design, flexibility, and efficiency. A key principle is separating mechanisms (how to do something) from policies (what to do), enabling flexibility and adaptability. For example, microkernel systems use minimal, policy-free mechanisms, allowing customization, while systems like Windows integrate both for consistency. Modern operating systems are typically written in higher-level languages like C or C++, with some assembly for critical parts, improving portability, maintainability, and performance. Compiler optimizations and efficient algorithms often outweigh the benefits of assembly language, making higher-level languages preferable for most OS development.

    4. 2.6 Why Applications Are Operating-System Specific Fundamentally, applications compiled on one operating system are not executable on other operating systems. If they were, the world would be a better place, and our choice of what operating system to use would depend on utility and features rather than which applications were available. Based on our earlier discussion, we can now see part of the problem—each operating system provides a unique set of system calls. System calls are part of the set of services provided by operating systems for use by applications. Even if system calls were somehow uniform, other barriers would make it difficult for us to execute application programs on different operating systems. But if you have used multiple operating systems, you may have used some of the same applications on them. How is that possible? An application can be made available to run on multiple operating systems in one of three ways: 1. The application can be written in an interpreted language (such as Python or Ruby) that has an interpreter available for multiple operating systems. The interpreter reads each line of the source program, executes equivalent instructions on the native instruction set, and calls native operating system calls. Performance suffers relative to that for native applications, and the interpreter provides only a subset of each operating system's features, possibly limiting the feature sets of the associated applications. 2. The application can be written in a language that includes a virtual machine containing the running application. The virtual machine is part of the language's full RTE. One example of this method is Java. Java has an RTE that includes a loader, byte-code verifier, and other components that load the Java application into the Java virtual machine. This RTE has been ported, or developed, for many operating systems, from mainframes to smartphones, and in theory any Java app can run within the RTE wherever it is available. 
Systems of this kind have disadvantages similar to those of interpreters, discussed above. 3. The application developer can use a standard language or API in which the compiler generates binaries in a machine- and operating-system-specific language. The application must be ported to each operating system on which it will run. This porting can be quite time consuming and must be done for each new version of the application, with subsequent testing and debugging. Perhaps the best-known example is the POSIX API and its set of standards for maintaining source-code compatibility between different variants of UNIX-like operating systems. In theory, these three approaches seemingly provide simple solutions for developing applications that can run across different operating systems. However, the general lack of application mobility has several causes, all of which still make developing cross-platform applications a challenging task. At the application level, the libraries provided with the operating system contain APIs to provide features like GUI interfaces, and an application designed to call one set of APIs (say, those available from iOS on the Apple iPhone) will not work on an operating system that does not provide those APIs (such as Android). Other challenges exist at lower levels in the system, including the following. Each operating system has a binary format for applications that dictates the layout of the header, instructions, and variables. Those components need to be at certain locations in specified structures within an executable file so the operating system can open the file and load the application for proper execution. CPUs have varying instruction sets, and only applications containing the appropriate instructions can execute correctly. Operating systems provide system calls that allow applications to request various activities, such as creating files and opening network connections. 
Those system calls vary among operating systems in many respects, including the specific operands and operand ordering used, how an application invokes the system calls, their numbering and number, their meanings, and their return of results. There are some approaches that have helped address, though not completely solve, these architectural differences. For example, Linux—and almost every UNIX system—has adopted the ELF format for binary executable files. Although ELF provides a common standard across Linux and UNIX systems, the ELF format is not tied to any specific computer architecture, so it does not guarantee that an executable file will run across different hardware platforms. APIs, as mentioned above, specify certain functions at the application level. At the architecture level, an application binary interface (ABI) is used to define how different components of binary code can interface for a given operating system on a given architecture. An ABI specifies low-level details, including address width, methods of passing parameters to system calls, the organization of the run-time stack, the binary format of system libraries, and the size of data types, just to name a few. Typically, an ABI is specified for a given architecture (for example, there is an ABI for the ARMv8 processor). Thus, an ABI is the architecture-level equivalent of an API. If a binary executable file has been compiled and linked according to a particular ABI, it should be able to run on different systems that support that ABI. However, because a particular ABI is defined for a certain operating system running on a given architecture, ABIs do little to provide cross-platform compatibility. In sum, all of these differences mean that unless an interpreter, RTE, or binary executable file is written for and compiled on a specific operating system on a specific CPU type (such as Intel x86 or ARMv8), the application will fail to run. 
Imagine the amount of work that is required for a program such as the Firefox browser to run on Windows, macOS, various Linux releases, iOS, and Android, sometimes on various CPU architectures.

      Applications are often operating-system specific due to differences in system calls, binary formats, and CPU instruction sets. System calls, which enable applications to interact with the OS, vary across platforms, making cross-platform execution challenging. Three approaches enable multi-OS compatibility: 1) Interpreted languages (e.g., Python) use interpreters to execute code on different OSes, though performance and feature sets may be limited. 2) Virtual machines (e.g., Java) run applications within a portable runtime environment, but with similar limitations. 3) Porting applications to each OS using standard APIs (e.g., POSIX) is time-consuming. Binary formats (e.g., ELF, PE) and ABIs further complicate cross-platform compatibility, as they are tied to specific architectures and OSes. These factors make developing cross-platform applications, like Firefox, a complex task requiring significant adaptation for each OS and CPU architecture.

    5. 2.5 Linkers and Loaders Usually, a program resides on disk as a binary executable file—for example, a.out or prog.exe. To run on a CPU, the program must be brought into memory and placed in the context of a process. In this section, we describe the steps in this procedure, from compiling a program to placing it in memory, where it becomes eligible to run on an available CPU core. The steps are highlighted in Figure 2.11. Figure 2.11 The role of the linker and loader. Source files are compiled into object files that are designed to be loaded into any physical memory location, a format known as a relocatable object file. Next, the linker combines these relocatable object files into a single binary executable file. During the linking phase, other object files or libraries may be included as well, such as the standard C or math library (specified with the flag -lm). A loader is used to load the binary executable file into memory, where it is eligible to run on a CPU core. An activity associated with linking and loading is relocation, which assigns final addresses to the program parts and adjusts code and data in the program to match those addresses so that, for example, the code can call library functions and access its variables as it executes. In Figure 2.11, we see that to run the loader, all that is necessary is to enter the name of the executable file on the command line. When a program name is entered on the command line on UNIX systems—for example, ./main—the shell first creates a new process to run the program using the fork() system call. The shell then invokes the loader with the exec() system call, passing exec() the name of the executable file. The loader then loads the specified program into memory using the address space of the newly created process. (When a GUI interface is used, double-clicking on the icon associated with the executable file invokes the loader using a similar mechanism.) 
The process described thus far assumes that all libraries are linked into the executable file and loaded into memory. In reality, most systems allow a program to dynamically link libraries as the program is loaded. Windows, for instance, supports dynamically linked libraries (DLLs). The benefit of this approach is that it avoids linking and loading libraries that may end up not being used into an executable file. Instead, the library is conditionally linked and is loaded if it is required during program run time. For example, in Figure 2.11, the math library is not linked into the executable file main. Rather, the linker inserts relocation information that allows it to be dynamically linked and loaded as the program is loaded. We shall see in Chapter 9 that it is possible for multiple processes to share dynamically linked libraries, resulting in a significant savings in memory use. Object files and executable files typically have standard formats that include the compiled machine code and a symbol table containing metadata about functions and variables that are referenced in the program. For UNIX and Linux systems, this standard format is known as ELF (for Executable and Linkable Format). There are separate ELF formats for relocatable and executable files. One piece of information in the ELF file for executable files is the program's entry point, which contains the address of the first instruction to be executed when the program runs. Windows systems use the Portable Executable (PE) format, and macOS uses the Mach-O format. ELF FORMAT Linux provides various commands to identify and evaluate ELF files. For example, the file command determines a file type. If main.o is an object file, and main is an executable file, the command file main.o will report that main.o is an ELF relocatable file, while the command file main will report that main is an ELF executable. ELF files are divided into a number of sections and can be evaluated using the readelf command.

      Linkers and loaders play a crucial role in transforming a program from a disk-based binary executable (e.g., a.out or prog.exe) into a memory-resident process ready for CPU execution. Source files are compiled into relocatable object files, which the linker combines into a single executable, incorporating libraries like the standard C library. The loader then loads this executable into memory, adjusting addresses through relocation to enable proper function calls and variable access. On UNIX systems, the shell uses fork() and exec() system calls to create a process and invoke the loader. Dynamic linking, as seen with Windows DLLs, allows libraries to be linked and loaded only when needed, saving memory. Executable files follow standard formats like ELF (Linux), PE (Windows), or Mach-O (macOS), containing machine code, symbol tables, and entry points. Tools like the file and readelf commands help analyze ELF files, distinguishing between relocatable and executable formats.

    6. 2.4 System Services Another aspect of a modern system is its collection of system services. Recall Figure 1.1, which depicted the logical computer hierarchy. At the lowest level is hardware. Next is the operating system, then the system services, and finally the application programs. System services, also known as system utilities, provide a convenient environment for program development and execution. Some of them are simply user interfaces to system calls. Others are considerably more complex. They can be divided into these categories: File management. These programs create, delete, copy, rename, print, list, and generally access and manipulate files and directories. Status information. Some programs simply ask the system for the date, time, amount of available memory or disk space, number of users, or similar status information. Others are more complex, providing detailed performance, logging, and debugging information. Typically, these programs format and print the output to the terminal or other output devices or files or display it in a window of the GUI. Some systems also support a registry, which is used to store and retrieve configuration information. File modification. Several text editors may be available to create and modify the content of files stored on disk or other storage devices. There may also be special commands to search contents of files or perform transformations of the text. Programming-language support. Compilers, assemblers, debuggers, and interpreters for common programming languages (such as C, C++, Java, and Python) are often provided with the operating system or available as a separate download. Program loading and execution. Once a program is assembled or compiled, it must be loaded into memory to be executed. The system may provide absolute loaders, relocatable loaders, linkage editors, and overlay loaders. Debugging systems for either higher-level languages or machine language are needed as well. Communications. 
These programs provide the mechanism for creating virtual connections among processes, users, and computer systems. They allow users to send messages to one another's screens, to browse web pages, to send e-mail messages, to log in remotely, or to transfer files from one machine to another. Background services. All general-purpose systems have methods for launching certain system-program processes at boot time. Some of these processes terminate after completing their tasks, while others continue to run until the system is halted. Constantly running system-program processes are known as services, subsystems, or daemons. One example is the network daemon discussed in Section 2.3.3.5. In that example, a system needed a service to listen for network connections in order to connect those requests to the correct processes. Other examples include process schedulers that start processes according to a specified schedule, system error monitoring services, and print servers. Typical systems have dozens of daemons. In addition, operating systems that run important activities in user context rather than in kernel context may use daemons to run these activities. Along with system programs, most operating systems are supplied with programs that are useful in solving common problems or performing common operations. Such application programs include web browsers, word processors and text formatters, spreadsheets, database systems, compilers, plotting and statistical-analysis packages, and games. The view of the operating system seen by most users is defined by the application and system programs, rather than by the actual system calls. Consider a user's PC. When a user's computer is running the macOS operating system, the user might see the GUI, featuring a mouse-and-windows interface. Alternatively, or even in one of the windows, the user might have a command-line UNIX shell. Both use the same set of system calls, but the system calls look different and act in different ways. 
Further confusing the user view, consider the user dual-booting from macOS into Windows. Now the same user on the same hardware has two entirely different interfaces and two sets of applications using the same physical resources. On the same hardware, then, a user can be exposed to multiple user interfaces sequentially or concurrently.

      System services, or utilities, enhance program development and execution by offering a structured environment. They include file management tools for manipulating files, status information programs for system details, and file modification utilities like text editors. Programming support features compilers and debuggers, while program loading tools manage execution. Communication programs enable virtual connections, and background services, or daemons, run essential processes continuously. These services, alongside application programs, shape the user’s perception of the operating system, often masking the underlying system calls. Different interfaces, like macOS GUI or UNIX shell, can coexist on the same hardware, offering varied user experiences despite shared resources.

    7. 2.3 System Calls

      System calls act as a bridge between user applications and the operating system, enabling access to essential system services. The section elaborates on their role through an example of copying data from one file to another, illustrating how system calls facilitate file operations, error handling, and user interaction. It highlights how even simple tasks require multiple system calls, such as reading input, opening files, writing data, and handling errors. The discussion extends to APIs, which simplify system call usage for developers, ensuring portability and ease of development. The role of the runtime environment (RTE) in managing system calls is also explored. Various system call types (process control, file management, device management, information maintenance, communications, and protection) are categorized with examples from Windows and UNIX. File management involves a set of essential system calls that enable users to create, delete, open, read, write, reposition, and close files. These operations also extend to directories, facilitating structured organization within a file system. File attributes, such as name, type, and protection codes, can be retrieved or modified using get_file_attributes() and set_file_attributes() system calls. Additionally, some operating systems provide built-in move() and copy() calls, while others integrate these functionalities through APIs. The close relationship between files and devices is evident in UNIX-like systems, where both share a unified interface. Device management encompasses requesting and releasing devices, which prevents conflicts such as deadlocks. System calls also support debugging, retrieving system information, and process monitoring. Communication between processes occurs through message-passing or shared memory, each with distinct advantages. 
Finally, system protection mechanisms, including permission controls, ensure secure resource access, safeguarding data in multiprogrammed and networked environments.

    8. 2.2 User and Operating-System Interface

      The user interface of an operating system determines how users interact with it. There are three primary methods: the command-line interface (CLI), the graphical user interface (GUI), and the touch-screen interface. Each has its unique advantages and use cases. The command interpreter, or shell, allows users to enter commands directly. Operating systems like Linux, UNIX, and Windows provide different shells, such as the Bash shell, which execute user commands. These commands can be built-in or executed via system programs. UNIX, for instance, runs external programs to process commands, making it easier to add new commands without modifying the shell. The graphical user interface (GUI), on the other hand, offers a more intuitive approach by using windows, icons, and menus. It was first developed at Xerox PARC in the 1970s and later popularized by Apple Macintosh and Microsoft Windows. GUIs allow users to interact with applications by clicking icons and navigating menus, significantly simplifying the user experience compared to text-based commands. For mobile devices, the touch-screen interface replaces both CLI and GUI with direct touch-based interactions. Users tap, swipe, or use gestures to navigate through applications. The iPhone and iPad, for example, use the Springboard interface to manage apps and settings, minimizing the need for external input devices like keyboards or mice. Ultimately, the choice between these interfaces depends on user preference and system functionality. Power users and administrators prefer CLI for efficiency and automation, while everyday users favor GUIs and touch interfaces for their ease of use.

    9. Virtualization is a technology that allows us to abstract the hardware of a single computer (the CPU, memory, disk drives, network interface cards, and so forth) into several different execution environments, thereby creating the illusion that each separate environment is running on its own private computer. These environments can be viewed as different individual operating systems (for example, Windows and UNIX) that may be running at the same time and may interact with each other. A user of a virtual machine can switch among the various operating systems in the same way a user can switch among the various processes running concurrently in a single operating system. Virtualization allows operating systems to run as applications within other operating systems. At first blush, there seems to be little reason for such functionality. But the virtualization industry is vast and growing, which is a testament to its utility and importance. Broadly speaking, virtualization software is one member of a class that also includes emulation. Emulation, which involves simulating computer hardware in software, is typically used when the source CPU type is different from the target CPU type. For example, when Apple switched from the IBM Power CPU to the Intel x86 CPU for its desktop and laptop computers, it included an emulation facility called “Rosetta,” which allowed applications compiled for the IBM CPU to run on the Intel CPU. That same concept can be extended to allow an entire operating system written for one platform to run on another. Emulation comes at a heavy price, however. Every machine-level instruction that runs natively on the source system must be translated to the equivalent function on the target system, frequently resulting in several target instructions. If the source and target CPUs have similar performance levels, the emulated code may run much more slowly than the native code. 
With virtualization, in contrast, an operating system that is natively compiled for a particular CPU architecture runs within another operating system also native to that CPU. Virtualization first came about on IBM mainframes as a method for multiple users to run tasks concurrently. Running multiple virtual machines allowed (and still allows) many users to run tasks on a system designed for a single user. Later, in response to problems with running multiple Microsoft Windows applications on the Intel x86 CPU, VMware created a new virtualization technology in the form of an application that ran on Windows. That application ran one or more guest copies of Windows or other native x86 operating systems, each running its own applications. (See Figure 1.16.) Windows was the host operating system, and the VMware application was the virtual machine manager (VMM). The VMM runs the guest operating systems, manages their resource use, and protects each guest from the others.

      Overview of Cloud Computing Building on virtualization, cloud computing provides on-demand online access to computers, storage, and software, with users paying according to the resources they consume. Deployment models include public, private, and hybrid clouds, and service models include SaaS, PaaS, and IaaS; a single environment can combine several of these service types.

      Embedded Systems Embedded systems are computing devices customized for specific tasks, typically with constrained interfaces. They frequently run real-time operating systems in order to satisfy stringent timing constraints, since they must process sensor data and respond to it quickly. They are used in a variety of industries, including industrial automation, medical devices, and automotive control.

      Free and Open-Source Operating Systems GNU/Linux is one example of an open-source operating system, one that makes its source code freely available for modification and redistribution. The free-software movement promotes user freedoms.

    1. Ethnography is the chief method of cultural anthropology, the study of human cultures and societies.

      Cultural anthropology and ethnography offer one of the more fascinating windows into the human psyche and its "trainability." We generally make the mistake of thinking the way we are is the only way we could be. For example, I heard recently of a professed Christian who refuses to believe he'd be a Muslim if he had been born in a Muslim country. Silliness, of course. Foreign kids adopted by American families grow up fully American. While there are certainly genetic predispositions (as evidenced by studies of separated twins), I think we are basically born as an unprogrammed computer waiting for our parents and society to load an operating program.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents convincing findings that oligodendrocytes play a regulatory role in spontaneous neural activity synchronisation during early postnatal development, with implications for adult brain function. Utilising targeted genetic approaches, the authors demonstrate how oligodendrocyte depletion impacts Purkinje cell activity and behaviours dependent on cerebellar function. Delayed myelination during critical developmental windows is linked to persistent alterations in neural circuit function, underscoring the lasting impact of oligodendrocyte activity.

      Strengths:

      (1) The research leverages the anatomically distinct olivocerebellar circuit, a well-characterized system with known developmental timelines and inputs, strengthening the link between oligodendrocyte function and neural synchronization.

      (2) Functional assessments, supported by behavioral tests, validate the findings of in vivo calcium imaging, enhancing the study's credibility.

      (3) Extending the study to assess the long-term effects of early-life myelination disruptions adds depth to the implications for both circuit function and behavior.

      Weaknesses:

      (1) The study would benefit from a closer analysis of myelination during the periods when synchrony is recorded. Direct correlations between myelination and synchronized activity would substantiate the mechanistic link and clarify if observed behavioral deficits stem from altered myelination timing.

      (2) Although the study focuses on Purkinje cells in the cerebellum, neural synchrony typically involves cross-regional interactions. Expanding the discussion on how localized Purkinje synchrony affects broader behaviors - such as anxiety, motor function, and sociality - would enhance the findings' functional significance.

      (3) The authors discuss the possibility of oligodendrocyte-mediated synapse elimination as a possible mechanism behind their findings, drawing from relevant recent literature on oligodendrocyte precursor cells. However, there are no data presented supporting this assumption. The authors should explain why they think the mechanism behind their observation extends beyond the contribution of myelination or remove this point from the discussion entirely.

      (4) It would be valuable to investigate the secondary effects of oligodendrocyte depletion on other glial cells, particularly astrocytes or microglia, which could influence long-term behavioral outcomes. Identifying whether the lasting effects stem from developmental oligodendrocyte function alone or also involve myelination could deepen the study's insights.

      (5) The authors should explore the use of different methods to disturb myelin production for a longer time, in order to further determine if the observed effects are transient or if they could have longer-lasting effects.

      (6) Throughout the paper, there are concerns about statistical analyses, particularly on the use of the Mann-Whitney test or using fields of view as biological replicates.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors use genetic tools to ablate oligodendrocytes in the cerebellum during postnatal development. They show that the oligodendrocyte numbers return to normal post-weaning. Yet, the loss of oligodendrocytes during development seems to result in decreased synchrony of calcium transients in Purkinje neurons across the cerebellum. Further, there were deficits in social behaviors and motor coordination. Finally, they suppress activity in a subset of climbing fibers to show that it results in similar phenotypes in the calcium signaling and behavioral assays. They conclude that the behavioral deficits in the oligodendrocyte ablation experiments must result from loss of synchrony.

      Strengths:

      Use of genetic tools to induce perturbations in a spatiotemporally specific manner.

      Weaknesses:

      The main weakness in this manuscript is the lack of a cohesive causal connection between the experimental manipulation performed and the phenotypes observed. Though they have taken great care to induce oligodendrocyte loss specifically in the cerebellum and at specific time windows, the subsequent experiments do not address specific questions regarding the effect of this manipulation. Calcium transients in Purkinje neurons are caused to a large extent by climbing fibers, but there is evidence for simple spikes to also underlie the dF/F signatures (Ramirez and Stell, Cell Reports, 2016). Also, it is erroneous to categorize these calcium signals as signatures of "spontaneous activity" of Purkinje neurons as they can have dual origins. Further, the effect of developmental oligodendrocyte ablation on the cerebellum has been previously reported by Mathis et al., Development, 2003. They report very severe effects such as the loss of molecular layer interneurons, stunted Purkinje neuron dendritic arbors, abnormal foliations, etc. In this context, it is hardly surprising that one would observe a reduction of synchrony in Purkinje neurons (perhaps due to loss of synaptic contacts, not only from CFs but also from granule cells). The last experiment with the expression of Kir2.1 in the inferior olive is hardly convincing. In summary, while the authors used a specific tool to probe the role of developmental oligodendrocytes in cerebellar physiology and function, they failed to answer specific questions regarding this role, which they could have done with more fine-grained experimental analysis.

    1. Does that make them history? And if so, can we understand them, as some politicians in the nineties apparently did, as transparent windows to the world of the past?

      I think to a point, these programs can be considered history. Some shows accurately tell the stories of their time period, while also highlighting fictional characters and stories surrounding them. They can indirectly represent the ideologies of the time period and address sensitive topics.

    1. DeepSeekMLA was an even bigger breakthrough. One of the biggest limitations on inference is the sheer amount of memory required: you both need to load the model into memory and also load the entire context window. Context windows are particularly expensive in terms of memory, as every token requires both a key and corresponding value; DeepSeekMLA, or multi-head latent attention, makes it possible to compress the key-value store, dramatically decreasing memory usage during inference.

      Multi-head Latent Attention

Compresses the key-value store of tokens, which decreases memory usage during inference.
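As a rough illustration of why this matters, the following sketch (hypothetical dimensions and function names, not DeepSeek's actual architecture or numbers) compares per-token KV-cache memory for standard multi-head attention against caching one compressed latent vector per token, as in multi-head latent attention:

```python
# Illustrative sketch only: hypothetical model dimensions, not DeepSeek's.
# Standard attention caches a key AND a value per head, per layer, per token;
# a latent-attention-style cache stores one compressed vector per token per
# layer, from which keys and values are reconstructed at attention time.

def kv_cache_bytes(n_tokens, n_layers, n_heads, head_dim, bytes_per_value=2):
    """Memory for a standard multi-head attention KV cache (fp16 by default)."""
    return n_tokens * n_layers * n_heads * head_dim * 2 * bytes_per_value

def latent_cache_bytes(n_tokens, n_layers, latent_dim, bytes_per_value=2):
    """Memory for a compressed latent cache: one latent vector per token/layer."""
    return n_tokens * n_layers * latent_dim * bytes_per_value

# Hypothetical 7B-class configuration with a 32k-token context window:
full = kv_cache_bytes(n_tokens=32_000, n_layers=32, n_heads=32, head_dim=128)
latent = latent_cache_bytes(n_tokens=32_000, n_layers=32, latent_dim=512)

print(full // 2**20, "MiB full KV cache")
print(latent // 2**20, "MiB latent cache")
```

With these made-up dimensions the latent cache is 16x smaller, which is the kind of saving that makes long context windows affordable at inference time.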

    1. It is a big, airy room, the whole floor nearly, with windows that look all ways, and air and sunshine galore.

      I enjoy the symbolism of this statement. You can really imagine what the speaker is talking about.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The hypothesis is based on the idea that inversions capture genetic variants that have antagonistic effects on male sexual success (via some display traits) and survival of females (or both sexes) until reproduction. Furthermore, a sufficiently skewed distribution of male sexual success will tend to generate synergistic epistasis for male fitness even if the individual loci contribute to sexually selected traits in an additive way. This should favor inversions that keep these male-beneficial alleles at different loci together at a cis-LD. A series of simulations are presented and show that the scenario works at least under some conditions. While a polymorphism at a single locus with large antagonistic effects can be maintained for a certain range of parameters, a second such variant with somewhat smaller effects tends to be lost unless closely linked. It becomes much more likely for genomically distant variants that add to the antagonism to spread if they get trapped in an inversion; the model predicts this should drive accumulation of sexually antagonistic variants on the inversion versus standard haplotype, leading to the evolution of haplotypes with very strong cumulative antagonistic pleiotropic effects. This idea has some analogies with one of predominant hypotheses for the evolution of sex chromosomes, and the authors discuss these similarities. The model is quite specific, but the basic idea is intuitive and thus should be robust to the details of model assumption. It makes perfect sense in the context of the geographic pattern of inversion frequencies. One prediction of the models (notably that leads to the evolution of nearly homozygously lethal haplotypes) does not seem to reflect the reality of chromosomal inversions in Drosophila, as the authors carefully discuss, but it is the case of some other "supergenes", notably in ants. So the theoretical part is a strong novel contribution.

      We appreciate the detailed and accurate summary of our main theoretical results.

      To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: parents (young adults), embryos, and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.

      As I have argued in my comments on previous versions, the experiment only addresses one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on male reproductive success and other fitness components, in particular of females. Furthermore, the design of this experiment is not ideal from the viewpoint of the biological hypothesis it is aiming to test. This is in part because, rather than testing for the effects of inversion on male reproductive success versus the key fitness components of survival to maturity and female reproductive output, it looks at the effects on male reproductive success versus survival to a rather old age of 2 months. The relevance of survival until old age to fitness under natural conditions is unclear, as the authors now acknowledge. Furthermore, up to 15% of males that may have contributed to the next generation did not survive until genotyping, and thus the difference between these males' inversion frequency and that in their offspring may be confounded by this potential survival-based sampling bias. The experiment does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing for synergistic epistasis would be exceedingly difficult, and the authors have now included a discussion of the above caveats and limitations, making their conclusions more tentative. This is good but of course does not make these limitations of the experiment go away. These limitations mean that the paper is stronger as a theoretical than as an empirical contribution.

      We discuss the choice to focus on exploring the potential antagonistic effects of the inversion karyotype on male reproductive success and survival in our general response above. Primarily, this prediction seemed to be the most specific to the proposed model as compared to other alternate models. Still, further studies are clearly needed to elucidate the potential frequency dependence and genetic architecture of the inversions.

      Regarding the choice of age at collection, it is unknown to what degree our selected collection age of 10 weeks correlates with survival in the wild, but we feel confident that there will be some positive correlation.

      We now further clarify that across our experiments, a minimum of 5% and a mean of 9% of the males used in the parental generation died before collection. These proportions do not appear sufficient to explain the differences between paternal and embryo inversion frequencies shown in Figure 9.

      Reviewer #2 (Public review):

      Summary:

      In their manuscript the authors address the question whether the inversion polymorphism in D. melanogaster can be explained by sexually antagonistic selection. They designed a new simulation tool to perform computer simulations, which confirmed their hypothesis. They also show a tradeoff between male reproduction and survival. Furthermore, some inversions display sex-specific survival.

      Strengths:

      It is an interesting idea about how chromosomal inversions may be maintained.

      Weaknesses:

      The authors motivate their study by the observation that inversions are maintained in D. melanogaster and because inversions are more frequent closer to the equator, the authors conclude that it is unlikely that the inversion contributes to adaptation in more stressful environments. Rather the inversion seems to be more common in habitats that are closer to the native environment of ancestral Drosophila populations.

      While I do agree with the authors that this observation is interesting, I do not think that it rules out a role in local adaptation. After all, the inversion is common in Africa, so it is perfectly conceivable that the non-inverted chromosome may have acquired a mutation contributing to the novel environment.

      Based on their hypothesis, the authors propose an alternative strategy, which could maintain the inversion in a population. They perform some computer simulations, which are in line with the predicted behavior. Finally, the authors perform experiments and interpret the results as empirical evidence for their hypothesis. While the reviewer is not fully convinced about the empirical support, the key problem is that the proposed model does not explain the patterns of clinal variation observed for inversions in D. melanogaster. According to the proposed model, the inversions should have a similar frequency along latitudinal clines. So in essence, the authors develop a complicated theory because they felt that the current models do not explain the patterns of clinal variation, but this model also fails to explain the pattern of clinal variation.

      To the contrary – in the Discussion paragraph beginning on Line 671, we explain why we would predict that a tradeoff between survival and reproduction should lead to clinal inversion frequencies. We suggest that a karyotype associated with a survival penalty should be increasingly disadvantageous in more challenging environments (such as high altitudes and latitudes for this species). Furthermore, an advantage in male reproductive competition conferred by that same haplotype may be reduced by the lower population densities that we would expect in more challenging environments (meaning that each female should encounter fewer males). Individually or jointly, these two factors predict that the equilibrium frequency of a balanced inversion frequency polymorphism should depend on a local population’s environmental harshness and population density, with the ensuing prediction that inversion frequency should correlate with certain environmental variables.

      Reviewer #3 (Public review):

      Summary:

      In this study, McAllester and Pool develop a new model to explain the maintenance of balanced inversion polymorphism, based on (sexually) antagonistic alleles and a trade-off between male reproduction and survival (in females or both sexes). Simulations of this model support the plausibility of this mechanism. In addition, the authors use experiments on four naturally occurring inversion polymorphisms in D. melanogaster and find tentative evidence for one aspect of their theoretical model, namely the existence of the above-mentioned trade-off in two out of the four inversions.

      Strengths:

      (1) The study develops and analyzes a new (Drosophila melanogaster-inspired) model for the maintenance of balanced inversion polymorphism, combining elements of (sexually) antagonistically (pleiotropic) alleles, negative frequency-dependent selection and synergistic epistasis. Simulations of the model suggest that the hypothesized mechanism might be plausible.

      (2) The above-mentioned model assumes, as a specific example, a trade-off between male reproductive display and survival; in the second part of their study, the authors perform laboratory experiments on four common D. melanogaster inversions to study whether these polymorphisms may be subject to such a trade-off. The authors observe that two of the four inversions show suggestive evidence that is consistent with a trade-off between male reproduction and survival.

      Open issues:

      (1) A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would thus be important and interesting, as the authors mention, to fill this gap in future work.

      (2) It will also be important to further explore and corroborate the potential importance and generality of trade-offs between different fitness components in maintaining inversion polymorphisms in future work.

      We appreciate the work put into evaluating, improving, and summarizing our study. We agree that further work studying the effects of dominance and of the fitness components of the inversions is important.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      l. 354: I don't understand what the authors mean by "an antagonistic and non-antagonistic allele". If there is an antagonistic polymorphism at a locus, then both alleles have antagonistic effects; i.e., allele B increases trait 1 and reduces trait 2 relative to allele A and vice versa.

      Edited, agreed that the terminology used here was sub-optimal.

      Reviewer #2 (Recommendations for the authors):

      The motivation for their model is their claim that the clinal inversion frequencies are not compatible with local adaptation. The reviewer doubts this strong statement. Furthermore, the proposed model also fails to explain the inversion frequencies in natural populations.

      Hence, rather than building a straw man, it would be better if the authors first show their experiments and then present their model as an explanation for the empirical results. Nevertheless, it is also clear that the empirical data are not very strong and cannot be fully explained by the proposed model.

      This claim that we reject any role of local adaptation in clinal variation and selection upon inversion polymorphism does not hold up in a reading of our manuscript. We even suggest that locally varying selective pressures must be playing some role, although that does not imply that local adaptation is the ultimate driver of inversion frequencies. Indeed, we suggest that local adaptation alone is an insufficient explanation for inversion frequency clines in D. melanogaster, including because (1) these frequency clines do not approach the alternate fixed genotypes predicted by local directional selection, (2) these derived inversions tend to be more frequent in more ancestral environments (l.113-158).

      In our public review response above, and in the Discussion section of our paper, we explain why our model can predict both the clinal frequencies of many Drosophila inversions and their intermediate maximal frequencies. Of course, we do not predict that most inversions in this species should follow the specific tradeoff investigated here. In fact, we were surprised to find even two inversions that experimentally supported our predicted tradeoff. Still, it remains possible that other inversions in this species are subject to other balanced tradeoffs not investigated here, which could help explain why they rarely reach high local frequencies.

      Reviewer #3 (Recommendations for the authors):

      My previous comments have been adequately addressed.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      […]

      To provide empirical support for this idea, the authors study the dynamics of inversions in population cages over one generation, tracking their frequencies through amplicon sequencing at three time points: parents (young adults), embryos, and very old adult offspring of either sex (>2 months from adult emergence). Out of four inversions included in the experiment, two show patterns consistent with antagonistic effects on male sexual success (competitive paternity) and the survival of offspring, especially females, until an old age, which the authors interpret as consistent with their theory.

      There are several reasons why the support from these data for the proposed theory is not waterproof.

      (1) As I have already pointed out in my previous review, survival until 2 months (in fact, it is 10 weeks and so 2.3 months) of age is of little direct relevance to fitness, whether under natural conditions or under typical lab conditions.

      The authors argue this objection away with two arguments

      First, citing Pool (2015) they claim that the average generation time (i.e. the average age at which flies reproduce) in nature is 24 days. That paper made an estimate of 14.7 generations per year under the North Carolina climate. As also stated in Pool (2015), the conditions in that locality for Drosophila reproduction and development are not suitable during three months of the year. This yields an average generation length of about 19.5 days during the 9 months during which the flies can reproduce. On the highly nutritional food used in the lab and at the optimal temperature of 25 C, Drosophila need about 11-12 days to develop from egg to adult. Even assuming these perfect conditions, the average age (counted from adult eclosion) would be about 8 days. In practice, larval development in nature is likely longer for nutritional and temperature reasons, and thus the genomic data analyzed by Pool imply that the average adult age of reproducing flies in nature would be about 5 days, and not 24 days, and even less 10 weeks. This corresponds neatly to the 2-6 days median life expectancy of Drosophila adults in the field based on capture-recapture (e.g., Rosewell and Shorrocks 1987).

      Second, the authors also claim that survival over a period of 2 month is highly relevant because flies have to survive long periods where reproduction is not possible. However, to survive the winter flies enter a reproductive diapause, which involves profound physiological changes that indeed allow them to survive for months, remaining mostly inactive, stress resistant and hidden from predators. Flies in the authors' experiment were not diapausing, given that they were given plentiful food and kept warm. It is still possible that survival to the ripe old age of 10 weeks under these conditions still correlates well with surviving diapause under harsh conditions, but if so, the authors should cite relevant data. Even then, I do not think this allows the authors to conclude that longevity is "the main selective pressure" on Drosophila (l. 936).

      This is overall a thoughtfully presented critique and we have endeavored to improve our discussion of Pool (2015) and to clarify some of the language used about survival elsewhere. While we agree that challenges other than survival to 10 weeks are very relevant to Drosophila melanogaster, collection at 10 weeks does encompass some of these other challenges. Egg to adult viability still contributes to the frequencies of the inversions at collection and is not separable from longevity in this data. Collection at this late age was chosen in part to encompass all lifetime fitness challenges that might influence the inversion frequency at collection, albeit still within permissive laboratory conditions. Future experiments exploring specific stressors independently and beyond permissive lab conditions would generate a clearer picture.

      In addition to general edits, the specific phrase mentioned at l. 936 [now line 1003] has been revised from “In many such cases females are in reproductive diapause, and so longevity is the main selective pressure.” to “While longevity is a key selective pressure underlying overwintering, the relationship between longevity in permissive lab conditions without diapause and in natural conditions under diapause is unclear (Schmidt et al. 2005; Flatt 2020), and our experiment represents just one of many possible ways to examine tradeoffs involving survival.”

      (2) It appears that the "parental" (in fact, paternal) inversion frequency was estimated by sequencing sires that survived until the end of the two-week mating period. No information is provided on male mortality during the mating period, but substantial mortality is likely given constant courtship and mating opportunities. If so, the difference between the parental and embryo inversion frequency could reflect the differential survival of males until the point of sampling rather than / in addition to sexual selection.

      We have further clarified that when referenced as parental frequency, the frequency presented is ½ the paternal frequency as the mothers were homokaryotypic for the standard arrangement. We chose to present both due to considerations in representing the frequency change from paternal to embryo frequencies, where a hypothetical change from 0.20 frequency in fathers to 0.15 frequency in embryos represents a selective benefit (a frequency increase in the population), despite the reality that this is a decrease in allele frequency between paternal and embryo cohorts.

      We mentioned a maximum 15% paternal mortality at line 827 [now l.1056], but have now added complete data on the counts of flies in the experiment as a supplemental table (Table S1) and have added or corrected further references to this in the results and methods [lines 555, 638, 975]. It is true that this may influence the observed frequency changes to some degree, and while we adjusted our sampling method to account for the effects of this mortality on statistical power [l.1056ff], we have now edited the manuscript to better highlight potential effects of this phenomenon on the recorded frequency changes.

      It is also worth noting that, if mortality among fathers over the mating period is codirectional with mortality among aged offspring, this would bias the results against detecting an opposing antagonistic selective effect of the inversions on paternity share. This is now also mentioned in the manuscript, l.639ff.

      (3) Finally, irrespective of the above caveats, the experimental data only address one of the elements of the theoretical hypothesis, namely antagonistic effects of inversions on reproduction and survival, notably that of females. It does not test for two other key elements of the proposed theory: the assumption of frequency-dependence of selection on male sexual success, and the prediction of synergistic epistasis for male fitness among genetic variants in the inversion. To be fair, particularly testing the latter prediction would be exceedingly difficult. Nonetheless, these limitations of the experiment mean that the paper is a much stronger theoretical than empirical contribution.

      This is a fair criticism of the limitations of our results, and we now summarize such caveats more directly in the discussion summary, lines 876ff.

      Reviewer #2 (Public Review): 

      […]

      Comments on the latest version:

      I would like to give an example of the confusing terminology of the authors:

      "Additionally, fitness conveyed by an allele favoring display quality is also frequency-dependent: since mating success depends on the display qualities of other males, the relative advantage of a display trait will be diminished as more males carry it..."

      I do not understand the difference from an advantageous allele: as it increases in frequency, the rate of increase of this allele decreases, but this has nothing to do with frequency-dependent selection. In my opinion, the authors re-define frequency-dependent selection, as for frequency-dependent selection fitness needs to change with frequency, but from their verbal description this is not clear.

      We have edited this text for greater clarity, now line 232ff. We did not seek to redefine frequency dependence, and did mean by “the relative advantage of a display trait will be diminished” that an equivalent s would diminish with frequency. We have now remedied terminological issues introduced in the prior revision with regard to frequency dependent selection.

      One example of how challenging the style of the manuscript is comes from their description of the DNA extraction procedure. In principle a straightforward method, but even here the authors provide a convoluted uninformative description of the procedure.

      We have edited for clarity the text on lines 1016-1020. Citing a published protocol and mentioning our modifications seems an appropriate trade-off between representing what was done accurately, citing the sources we relied on in doing it, and limiting the volume of information in the main text for such a straightforward and common method. 

      It is not apparent to the reviewer why the authors have not invested more effort to make their manuscript digestible.

      We have invested a great deal of effort in making this manuscript as clear as we are able to.  We regret that our writing has not been to this reviewer’s liking. We believe we have been highly responsive to all specific criticisms, including revising all passages cited as unclear. In this round, we have again scrutinized the entire manuscript for any opportunity to clarify it, and we have made further changes throughout.  Although our subject matter is conceptually nuanced, we nevertheless remain optimistic that a careful, fresh reading of our revised manuscript would yield a more favorable impression.

      Reviewer #3 (Public Review):

      […]

      Weaknesses:

      A gap in the current modeling is that, while a diploid situation is being studied, the model does not investigate the effects of varying degrees of dominance. It would be important and interesting to fill this gap in future work.

      Agreed, and now reinforced at lines 892ff.

      Comments on the latest version:

      Most of the comments which I have made in my public review have been adequately addressed.

      Some of the writing still seems somewhat verbose and perhaps not yet maximally succinct; some additional line-by-line polishing might still be helpful at this stage in terms of further improving clarity and flow (for the authors to consider and decide).

      We have made further changes and some polishing in this draft, and greatly appreciate the guidance provided in improving the draft so far. 

      Reviewer #1 (Recommendations For The Authors):

      (1) While the model results are convincing, some of the verbal interpretation is confusing. In particular, the authors state that in their model the allele favoring male display quality shows a negative frequency dependence whereas the alternative allele has a positive frequency dependence. This does not make sense to me in the context of population genetics theory. For a one-locus, two-allele model the change of allele frequency under selection depends on the fitness of the genotypes concerned relative to each other. Thus, at least under no dominance assumed in this model, if the relative fitness of AA decreases with the frequency of allele A, the relative fitness of aa must decrease with the frequency of allele a. I.e., if selection is negatively frequency dependent, then it is so for both alleles.

      This phrasing was wrong, and we have edited the relevant section.

      (2) I am still not entirely sure that the synergistic epistasis assumed in the verbal model is actually generated in the simulations; this would be easy enough to check by extracting the mating success of males with different genotypes from the simulation output should be reported, e.g., as a figure supplement.

      Our new Figure S2, which depicts haplotype frequencies for a set of the simulations presented in Figure 4, should demonstrate a necessary presence of synergistic epistasis. These results further clarify that the weaker allele B is only kept when linked to A. The same fitness classes of genotype are present in the simulations with and without the inversion, so the only mechanical difference is the rate of recombination, and the only way this might change selection on the alleles is if a variant has a different fitness in one haplotype background than another – i.e. epistasis. The maintenance of haplotypes AB and ab to the exclusion of Ab and aB relies on the lesser relative fitness of Ab and aB. And since survival values are multiplicative, this additional contribution must come from the mate success of AB being disproportionately larger than Ab or aB, indicating the emergent synergistic epistasis posited by our model. We have clarified this point in the text at line 363ff.

      (3) l. 318ff: What was this set number of males? I could not find this information anywhere. Also, this model of the mating system is commonly referred to as "best of N", so the authors may want to include this label in the description.

      We indicate this detail just after the referenced line, now reworded and on l. 338-340 as “For each female’s mating competition, 100 males were sampled, though see Figure S1 for plots with varying encounter number.”  Among these edits, “one hundred” has been changed to a numeral for easier skimming, and Figure S1 is now referenced here earlier in the text. Several edits have also been made in the caption of Figures 2 and 3, and in the relevant methods section to clarify the number of encountered males simulated, mention best of N terminology, and clarify how the quality score is used in the mate competition.
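To make the "best of N" mate-choice scheme concrete, here is a minimal hypothetical sketch (not SAIsim's actual code; the function and parameter names are invented for illustration) of one female's mating competition: sample N males from the population and choose the one with the highest display-quality score.

```python
import random

# Hypothetical sketch of a "best of N" mate-choice step. Each female samples
# n_encounters males and mates with the one whose display-quality score is
# highest; higher-quality males therefore win a disproportionate paternity
# share as n_encounters grows.

def best_of_n_mating(male_qualities, n_encounters=100, rng=random):
    """Return the index of the chosen sire for one female's competition.

    male_qualities: one display-quality score per male in the population.
    n_encounters: number of males sampled (100 in the simulations described
    above; the paper's Figure S1 reportedly varies this number).
    """
    k = min(n_encounters, len(male_qualities))
    candidates = rng.sample(range(len(male_qualities)), k=k)
    return max(candidates, key=lambda i: male_qualities[i])

# Example: one mating competition in a population of 500 males.
rng = random.Random(0)
qualities = [rng.random() for _ in range(500)]
sire = best_of_n_mating(qualities, n_encounters=100, rng=rng)
```

Because the winner is chosen deterministically among the sampled candidates, this scheme skews the paternity distribution toward high-quality males, which is the ingredient the model uses to generate effectively synergistic epistasis among display alleles.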

      (4) The description of the experiment is still confusing. The number of individuals of each sex entered in each mating cage is missing from the Methods (l. 914); although I did finally find it in the Results. These flies were laying over 2 weeks - does this mean that offspring from the entire period were used to obtain the embryo and aged offspring frequencies, or only from a particular egg collection? If the former, does this mean that the offspring obtained from different egg batches were aged separately? Were the offspring aged in cages or bottles, at what density? Given that only those males that survived until the end of the two-week mating period were sequenced, it is important to know what % of the initial number of males these survivors were. A substantial mortality of the parental males could bias the estimate of parental frequencies. How many parental males, embryos and aged offspring were sequenced? Were all individuals of a given cage and stage extracted and sequenced as a single pool or were there multiple pools? The description could also be structured better. For example, the food and grape agar recipes and cage construction are inserted at random points of the description of the crossing design, which does not help.

      We have now reorganized and edited these portions of the Methods text. Portions of this comment overlap with edits responding to (2) of the Public Review and below for l. 921 in Details. Offspring from different laying periods were aged in different bottles, further separated by the time at which they eclosed. They were then pooled for DNA extraction and library preparation by sex and a binary early or late eclosion time. These data were present in the “D. mel. Sample Size” column of supplemental tables S6 and S7 (now S7 and S8), but we have added and referenced a new table to specifically collate the sample sizes of different experimental stages, table S1. Now referenced at lines 555, 638, 975, 1057.

      (5) The caption of figure 9 and the discussion of its results should be clear and explicit about the fact that "adult offspring" in Fig 9A and "female" and "male" refer to adults surviving to old age (whereas "parental" in Fig 9A refers to young adults in their reproductive prime). This has consequences for the interpretation of the difference between "parental" and "adult offspring", as it combines one generation of usual selection as it occurs under the conditions of the lab culture (young adult at generation t -> young adult in generation t+1) with an additional step of selection for longevity. Thus, a marked change in allele frequency does not imply that the "parental" frequency does not represent an equilibrium frequency of the inversions under the lab culture conditions. Furthermore, it would be useful to state explicitly that Figure 9B represents the same results as figure 9A, but with the aged offspring split by sex.

      Figure caption edited to provide further clarity on the age of cohorts and presented data, along with the relevant results section (2.3) referencing this figure.

      We avoid making any statements about the equilibrium frequencies of inversions under lab conditions, and whether or not any step of our experiment reflects such equilibria, because our investigation does not rely upon or test for such conditions. Instead, our analysis focuses on whether inversions have contrasting effects (as indicated by frequency changes that are incompatible with neutral sampling) between different life history components.  Under our model, such frequency reversals might be detectable both at equilibrium balanced inversion frequencies and also at frequencies some distance away from equilibria. We have now clarified this point at l. 970-972.

      Details:

      l. 211: this should be modified as male-only costs are now included.

      Edited. “survival likelihood (of either or both sexes).”

      l. 343: misplaced period

      Edited.

      l. 814: "We confirmed model predictions...": This sounds like it refers to an empirical confirmation of a theory prediction, but I think the authors just want to say that their simulations predicted antagonistic variants can be maintained at an intermediate equilibrium frequency. So the wording should be changed to avoid ambiguity.

      Edited. Now line 869.

      l. 853: How can a genome be "empty"? Do the authors mean an absence of any polymorphism?

      Edited to: “In SAIsim, a population is instantiated as a python object, and populated with individuals which are also represented by python objects. These individuals may be instantiated using genomes specified by the user, or by default carry no genomic variation.” Lines 913ff.
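      For illustration, the object structure described in this edit might look like the following minimal Python sketch. This is a schematic illustration only, not the actual SAIsim API; the class names and arguments are hypothetical.

      ```python
      class Individual:
          def __init__(self, genome=None):
              # By default an individual carries no genomic variation
              self.genome = genome if genome is not None else []

      class Population:
          def __init__(self, size, genomes=None):
              # Individuals may be built from user-specified genomes,
              # or with empty (variation-free) genomes by default
              if genomes is None:
                  genomes = [None] * size
              self.individuals = [Individual(g) for g in genomes]

      pop = Population(size=3)  # three individuals, no genomic variation
      ```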

      l. 853: I do not see this diagramed in Figure 5

      Apologies, fixed to Fig. 2

      l. 864: is crossing-over in the model limited to female gametogenesis (reflecting the Drosophila case) or does it occur in both sexes?

      There is a variable in the simulator to make crossover female-specific. All simulations were performed with female-only crossover. Edited for clarity. “While the simulator can allow recombination in both sexes, all simulations presented only generate crossovers and gene conversion events for female gametes, in accordance with the biology of D. melanogaster.” Lines 928-929.

      l. 906: "F2" is ambiguous; does this mean that the mix of lines was allowed to breed for two generations? Also, in other places in the manuscript these flies appear to be referred to as "parental". So do not use F2.

      Edited; the F2 language has been removed and replaced with a statement that the flies were allowed to breed for two generations. Now lines 967ff.

      l. 910: this is incorrect/imprecise; what can be inferred is the frequency of the inversions in male gametes that contributed to fertilization. This would correspond to the frequency in successful males only if each successful male genotype had the same paternity share.

      Edited, now “Since no inversions could be inherited through the mothers, inversion frequencies among successful male gametes could be inferred from their pooled offspring.” Now line 994.

      l. 912: "without a controlled day/night cycle" meaning what? Constant light? Constant darkness? Daylight falling through the windows?

      Edited to “Unless otherwise noted, all flies were kept in a lab space of 23°C with around a degree of temperature fluctuation and without a controlled day/night cycle. Light exposure was dependent on the varying use of the space by laboratory workers but amounted to near constant exposure to at least a minimal level of lighting, with some variable light due to indirect lighting from adjacent rooms with exterior windows.” Now lines 1007-1010.

      l. 921: I cannot parse this sentence. Were the offspring isolated as virgins?

      No, the logistics of collecting virgins would have been prohibitive, and it did not seem essential for our experiment. Hopefully the edits to this section are clearer, now lines 978ff.

    1. making your PC the deepest node in the network infrastructure

      making your PC the deepest node in the network infrastructure, such that your computer serves as both client and server to present web pages to third parties

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      (1) As VRMate (a component of behaviorMate) is written using Unity, what is the main advantage of using behaviorMate/VRMate compared to using Unity alone paired with Arduinos (e.g. Campbell et al. 2018), or compared to using an existing toolbox to interface with Unity (e.g. Alsbury-Nealy et al. 2022, DOI: 10.3758/s13428-021-01664-9)? For instance, one disadvantage of using Unity alone is that it requires programming in C# to code the task logic. It was not entirely clear whether VRMate circumvents this disadvantage somehow -- does it allow customization of task logic and scenery in the GUI? Does VRMate add other features and/or usability compared to Unity alone? It would be helpful if the authors could expand on this topic briefly.

      We have updated the manuscript (lines 412-422) to clarify the benefits of separating the VR system as an isolated program and a UI that can be run independently. We argue that “…the recommended behaviorMate architecture has several important advantages. Firstly, by rendering each viewing angle of a scene on a dedicated device, performance is improved by splitting the computational costs across several inexpensive devices rather than requiring specialized or expensive graphics cards in order to run…, the overall system becomes more modular and easier to debug [and] implementing task logic in Unity would require understanding Object-Oriented Programming and C# … which is not always accessible to researchers that are typically more familiar with scripting in Python and Matlab.”

      VRMate receives detailed configuration information from behaviorMate at runtime specifying which VR objects to display, and receives position updates during experiments. Any other necessary information about triggering rewards or presenting non-VR cues is still handled by the UI, so no editing of Unity is necessary. Scene configuration information is in the same JSON format as the behaviorMate settings files. Additionally, Unity Editor scripts provided in the VRMate repository permit customizing scenes through a “drag and drop” interface and then writing the scene configuration files programmatically. Users interested in these features should see our github page for example scene.vr files and download the VRMate repository (including the editor scripts). We provide 4 VR contexts, as well as a settings file that uses one of them, which can be found on the behaviorMate github page (https://github.com/losonczylab/behaviorMate) in the “vr_contexts” and “example_settigs_files” directories. These examples are provided to assist VRMate users in getting set up and offer a more detailed example of how VRMate and behaviorMate interact.
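      As a hypothetical illustration of writing such a scene configuration file programmatically: the keys below are invented for the example; the real schema is shown by the example scene.vr files in the VRMate repository.

      ```python
      import json

      # Invented keys for illustration; see the example scene.vr files in the
      # VRMate repository for the actual schema.
      scene = {
          "objects": [
              {"model": "cue_tower", "position": [0.0, 0.0, 150.0]},
              {"model": "end_wall", "position": [0.0, 0.0, 400.0]},
          ],
          "track_length": 400,
      }

      # An editor script would write this out for VRMate to load at runtime.
      with open("example_scene.vr", "w") as f:
          json.dump(scene, f, indent=2)
      ```

      Because the scene file is plain JSON, it can be generated by the provided “drag and drop” editor scripts or by any script the user prefers.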

      (2) The section on "context lists", lines 163-186, seemed to describe an important component of the system, but this section was challenging to follow and readers may find the terminology confusing. Perhaps this section could benefit from an accompanying figure or flow chart, if these terms are important to understand.

      We maintain the use of the terms context and context list in order to maintain a degree of parity with the java code. However, we have updated lines 173-175 to define the term context for the behaviorMate system: “... a context is a grouping of one or more stimuli that get activated concurrently. For many experiments it is desirable to have multiple contexts that are triggered at various locations and times in order to construct distinct or novel environments.”

      a. Relatedly, "context" is used to refer to both when the animal enters a particular state in the task like a reward zone ("reward context", line 447) and also to describe a set of characteristics of an environment (Figure 3G), akin to how "context" is often used in the navigation literature. To avoid confusion, one possibility would be to use "environment" instead of "context" in Figure 3G, and/or consider using a word like "state" instead of "context" when referring to the activation of different stimuli.

      Thank you for the suggestion. We have updated Figure 3G to say “Environment” in order to avoid confusion.

      (3) Given the authors' goal of providing a system that is easily synchronizable with neural data acquisition, especially with 2-photon imaging, I wonder if they could expand on the following features:

      a. The authors mention that behaviorMate can send a TTL to trigger scanning on the 2P scope (line 202), which is a very useful feature. Can it also easily generate a TTL for each frame of the VR display and/or each sample of the animal's movement? Such TTLs can be critical for synchronizing the imaging with behavior and accounting for variability in the VR frame rate or sampling rate.

      Different experimental demands require varying levels of precision in these kinds of synchronization signals. For this reason, we have opted against a “one-size-fits-all” approach for synchronization with physiology data in behaviorMate. Importantly, this keeps the individual rig costs low, which can be useful when constructing setups specifically for training animals. behaviorMate will log TTL pulses sent to GPIO pins set up as sensors, and can be configured to generate TTL pulses at regular intervals. Additionally, all UDP packets received by the UI are time stamped and logged. We also include the output of the Arduino millis() function in all UDP packets, which can be used for further investigation of clock drift between system components. Importantly, since the system is event driven, there cannot be accumulating drift across running experiments between the behaviorMate UI and networked components such as the VR system.

      For these reasons, we have not needed to implement a VR frame-synchronization TTL for any of our experiments. However, one could extend VRMate to send "sync" packets back to behaviorMate to log precisely when each frame was displayed, or to emit TTL pulses (if using the same ODROID hardware we recommend in the standard setup for rendering scenes). This would be useful if it were important to account for slight changes in the frame rate at which the scenes are displayed. However, splitting the rendering of large scenes between several devices results in fast update times, and our testing and benchmarks indicate that display updates are smooth and continuous enough to appear coupled to movement updates from the behavioral apparatus and sufficient for engaging navigational circuits in the brain.

      b. Is there a limit to the number of I/O ports on the system? This might be worth explicitly mentioning.

      We have updated lines 219-220 in the manuscript to provide this information: Sensors and actuators can be connected to the controller using one of the 13 digital or 5 analog input/output connectors.

      c. In the VR version, if each display is run by a separate Android computer, is there any risk of clock drift between displays? Or is this circumvented by centralized control of the rendering onset via the "real-time computer"?

      This risk is mitigated by the real-time computer/UI sending position updates to the VR displays. The maximum amount by which scenes can be out of sync is limited because they all recalibrate on every position update, which occurs multiple times per second as the animal is moving. Moreover, because position updates are constantly being sent by behaviorMate to VRMate, and VRMate immediately updates the scene according to this position, the most the scene can fall out of sync with the mouse's position is bounded by the maximum latency multiplied by the running speed of the mouse. For experiments focusing on eliciting an experience of navigation, such a degree of asynchrony is almost always negligible. For other experimental demands, it would be possible to incorporate more precise frame-timing information, but this was not necessary for our use case and likely for most other use cases. Additionally, refer to the response to comment 3a.
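      As a worked example of that bound, with illustrative numbers (not measured benchmarks of the system):

      ```python
      # Worst-case scene lag is bounded by (maximum update latency) x (running speed).
      max_latency_s = 0.010        # assume a 10 ms worst-case position-update latency
      run_speed_cm_per_s = 30.0    # assume a fast 30 cm/s run
      worst_case_offset_cm = max_latency_s * run_speed_cm_per_s
      # i.e. the displayed scene trails the true position by at most ~0.3 cm
      ```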

      Reviewer #2 (Public review):

      (1) The central controlling logic is coupled with GUI and an event loop, without a documented plugin system. It's not clear whether arbitrary code can be executed together with the GUI, hence it's not clear how much the functionality of the GUI can be easily extended without substantial change to the source code of the GUI. For example, if the user wants to perform custom real-time analysis on the behavior data (potentially for closed-loop stimulation), it's not clear how to easily incorporate the analysis into the main GUI/control program.

      Without any edits to the existing source code, behaviorMate is highly customizable through the settings files, which allow users to combine the existing contexts and decorators in arbitrary combinations. Users have therefore been able to perform a wide variety of 1D navigation tasks, well beyond our anticipated use cases, by generating novel settings files. The typical method for providing closed-loop stimulation would be to set up a context that is triggered by animal behavior using decorators (e.g. based on position, lap number, and time) and then trigger the stimulation with a TTL pulse. In the rare case that users require a behavioral condition that is neither currently implemented nor composable from existing decorators, custom Java code is required to extend the UI. Performing such edits requires only knowledge of basic object-oriented programming in Java and generating a single subclass of either the BasicContextList or ContextListDecorator classes. In addition, the JavaFX (under development) version of behaviorMate incorporates a plugin system which does not require recompiling the code in order to make these changes. However, since the JavaFX software is currently under development, documentation does not yet exist. All software is open-sourced and available on github.com for users interested in generating plugins or altering the source code.

      We have added the additional caveat to the manuscript in order to clarify this point (Line 197-202): “However, if the available set of decorators is not enough to implement the required task logic, some modifications to the source code may be necessary. These modifications, in most cases, would be very simple and only a basic understanding of object-oriented programming is required. A case where this might be needed would be performing novel customized real-time analysis on behavior data and activating a stimulus based on the result”

      (2) The JSON messaging protocol lacks API documentation. It's not clear what the exact syntax is, supported key/value pairs, and expected response/behavior of the JSON messages. Hence, it's not clear how to develop new hardware that can communicate with the behaviorMate system.

      The most common approach for adding novel hardware is to use TTL pulses (or to accept an emitted TTL pulse to read sensor states). This type of hardware addition is possible through the existing GPIO without the need to interact with the software or JSON API. Users looking to take advantage of the ability to set up and configure novel behavioral paradigms without writing any software would be limited to adding hardware that can be triggered by, and report to, the UI with a TTL pulse (however, fairly complex actions can be triggered this way).

      For users looking to develop more customized hardware solutions that interact closely with the UI or GPIO board, additional documentation on the JSON messaging protocol has been added to the behaviormate-utils repository (https://github.com/losonczylab/behaviormate_utils). Additionally, we have added a link to this repository in the Supplemental Materials section (line 971) and referenced this in the manuscript (line 217) to make it easier for readers to find this information.

      Furthermore, developers looking to add completely novel components to the UI can implement the interface described by Context.java in order to exchange custom messages with hardware (described in the JavaDoc: https://www.losonczylab.org/behaviorMate-1.0.0/). These messages would be defined within the custom context and interact with the custom hardware (meaning the interested developer would make a novel addition to the messaging API). Additionally, it should be noted that, without editing any software, any UDP packets sent to behaviorMate from an IP address specified in the settings will be time stamped and logged in the stored behavioral data file, meaning that there is a large variety of hardware implementation solutions, using both standard UDP messaging and TTL pulses, that can work with behaviorMate with minimal effort. Finally, see the response to R2.1 for a discussion of the JavaFX version of the behaviorMate UI, including plugin support.
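      As a sketch of this pattern, a custom sensor module might report an event to the UI with a JSON-formatted UDP packet. The field names here are illustrative only; the actual message schema is documented in the behaviormate-utils repository.

      ```python
      import json
      import socket
      import time

      def report_event(ui_addr, pin, action):
          """Send one JSON-formatted UDP packet describing a hardware event.

          ui_addr is the (host, port) of the behaviorMate UI; "pin", "action",
          and "millis" are hypothetical field names for this illustration.
          """
          packet = {"pin": pin, "action": action,
                    "millis": int(time.monotonic() * 1000)}
          sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          sock.sendto(json.dumps(packet).encode("utf-8"), ui_addr)
          sock.close()
          return packet
      ```

      Because any UDP packet arriving from a configured IP address is time stamped and logged, a module like this would need no changes to the UI source.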

      (3) It seems the existing control hardware and the JSON messaging only support GPIO/TTL types of input/output, which limits the applicability of the system to more complicated sensor/controller hardware. The authors mentioned that hardware like Arduino natively supports serial protocols like I2C or SPI, but it's not clear how they are handled and translated to JSON messages.

      We provide an implementation for an I2C-based capacitance lick detector, which interested developers may wish to copy if support for novel I2C or SPI devices is needed. Users with less development experience wishing to expand the hardware capabilities of behaviorMate could also develop adapters which can be triggered on a TTL input/output. Additionally, more information about the JSON API and how messages are transmitted to the PC by the Arduino is described in point (2) and the expanded online documentation.

      a. Additionally, because it's unclear how easy to incorporate arbitrary hardware with behaviorMate, the "Intranet of things" approach seems to lose attraction. Since currently, the manuscript focuses mainly on a specific set of hardware designed for a specific type of experiment, it's not clear what are the advantages of implementing communication over a local network as opposed to the typical connections using USB.

      As opposed to the serial communication protocols typical of USB, networking protocols seamlessly function based on asynchronous message passing. Messages may be routed internally (e.g. to the PC's localhost address, i.e. 127.0.0.1) or to a variety of external hardware (e.g. using IP addresses such as those in the range 192.168.1.2 - 192.168.1.254). Furthermore, network-based communication allows modules, such as VR, to be added easily. behaviorMate systems can be easily expanded using low-cost Ethernet switches and consume only a single network adapter on the PC (i.e. they are not limited by the number of physical USB ports). Furthermore, UDP message passing is implemented in almost all modern programming languages in a platform-independent manner (meaning that the same software can run on OSX, Windows, and Linux). Lastly, as we have pointed out (Line 117), a variety of tools exist for inspecting network packets and debugging, meaning that it is possible to run behaviorMate with simulated hardware for testing and debugging.

      The IOT nature of behaviorMate means there is no requirement for novel hardware to be implemented using an Arduino, since any system capable of UDP communication can be configured. For example, VRMate is usually run on Odroid C4s; however, one could easily create a system using Raspberry Pis or even additional PCs. behaviorMate is agnostic to the format of the UDP messages, but packaging any data in the JSON format for consistency is encouraged. If the new hardware is a sensor whose input needs to be time stamped and logged, then all that is needed is to add the IP address and port information to the ‘controllers’ list in a behaviorMate settings file. If more complex interactions with novel hardware are needed, then a custom implementation of ContextList.java may be required (see response to R2.2). However, the provided UdpComms.java class could be used to easily send/receive messages from custom Context.java subclasses.

      Solutions for highly customized hardware do require basic familiarity with object-oriented programming in the Java programming language. However, in our experience most behavioral experiments do not require these kinds of modifications. The majority of 1D navigation tasks, which behaviorMate is currently best suited to control, require touch/motion sensors, LEDs, speakers, or solenoid valves, which are easily controlled by the existing GPIO implementation. It is unlikely that custom subclasses would even be needed.

      Reviewer #3 (Public review):

      (1) While using UDP for data transmission can enhance speed, it is thought that it lacks reliability. Are there error-checking mechanisms in place to ensure reliable communication, given its criticality alongside speed?

      The provided GPIO/behavior controller implementation sends acknowledgement packets in response to all incoming messages, as well as start and stop messages for contexts and “valves”. In this way the UI can update to reflect both requested state changes and when they actually happen (although there is rarely a perceptible gap between these two states unless something is unplugged or not functioning). See Line 85 in the revised manuscript: “acknowledgement packets are used to ensure reliable message delivery to and from connected hardware”.
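      A generic sketch of this acknowledgement pattern over UDP follows; it illustrates the pattern only, and behaviorMate's actual packet contents are documented in the behaviormate-utils repository.

      ```python
      import json
      import socket

      def send_with_ack(sock, addr, message, retries=3, timeout_s=0.5):
          """Send a JSON message over UDP and wait for an acknowledgement
          packet, resending up to `retries` times if none arrives."""
          payload = json.dumps(message).encode("utf-8")
          sock.settimeout(timeout_s)
          for _ in range(retries):
              sock.sendto(payload, addr)
              try:
                  ack, _ = sock.recvfrom(1024)
                  return ack  # the hardware confirmed the state change
              except socket.timeout:
                  continue    # no acknowledgement: resend
          raise RuntimeError("no acknowledgement received")
      ```

      Resend-on-timeout is the standard way to recover UDP's lost-packet case while keeping the low latency that motivates using UDP in the first place.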

      (2) Considering this year's price policy changes in Unity, could this impact the system's operations?

      VRMate is not affected by the recent changes in the pricing structure of the Unity project.

      The existing compiled VRMate software does not need to be regenerated to update VR scenes, or implement new task logic (since this is handled by the behaviorMate GUI). Therefore, the VRMate program is robust to any future pricing changes or other restructuring of the Unity program and does not rely on continued support of Unity. Additionally, while the solution presented in VRMate has many benefits, a developer could easily adapt any open-source VR Maze project to receive the UDP-based position updates from behaviorMate or develop their own novel VR solutions.

      (3) Also, does the Arduino offer sufficient precision for ephys recording, particularly with a 10ms check?

      Electrophysiology recording hardware typically has additional I/O channels which can assist with tracking behavior/synchronization at high resolution. While behaviorMate could still be used to trigger reward valves, either the ephys hardware or an additional high-speed DAQ would be recommended to maintain accurate alignment with high-speed physiology data. behaviorMate could still be set up as normal to provide closed- and open-loop task control at behaviorally relevant timescales alongside a DAQ circuit recording events at a consistent temporal resolution. While this would increase the relative cost of the individual recording setup, identical rigs for training animals could still be configured without the DAQ circuit, avoiding unnecessary cost and complexity.

      (4) Could you clarify the purpose of the Sync Pulse? In line 291, it suggests additional cues (potentially represented by the Sync Pulse) are needed to align the treadmill screens, which appear to be directed towards the Real-Time computer. Given that event alignment occurs in the GPIO, the connection of the Sync Pulse to the Real-Time Controller in Figure 1 seems confusing.

      A number of methods exist for synchronizing recording devices like microscopes or electrophysiology recordings with behaviorMate’s time-stamped logs of actuators and sensors. For example, the GPIO circuit can be configured to send sync triggers, or receive timing signals as input. Alternatively a dedicated circuit could record frame start signals and relay them to the PC to be logged independently of the GPIO (enabling a high-resolution post-hoc alignment of the time stamps). The optimal method to use varies based on the needs of the experiment. Our setups have a dedicated BNC output and specification in the settings file that sends a TTL pulse at the start of an experiment in order to trigger 2p imaging setups (see line 224, specifically that this is a detail of “our” 2p imaging setup). We provide this information as it might be useful suggesting how to have both behavior and physiology data start recording at the same time. We do not intend this to be the only solution for alignment. Figure 1 indicates an “optional” circuit for capturing a high speed sync pulse and providing time stamps back to the real time PC. This is another option that might be useful for certain setups (or especially for establishing benchmarks between behavior and physiology recordings). In our setup event alignment does not exclusively occur on the GPIO.

      a. Additionally, why is there a separate circuit for the treadmill that connects to the UI computer instead of the GPIO? It might be beneficial to elaborate on the rationale behind this decision in line 260.

      Event alignment does not occur on the GPIO; we separate concerns between position tracking and more general input/output features, which improves performance and simplifies debugging. In this sense we maintain a single event loop on the Arduino, avoiding the need to either run multithreaded operations or rely extensively on interrupts, which can cause unpredictable code execution (e.g. when multiple interrupts occur at the same time). Our position-tracking circuit is therefore coupled to a separate, low-cost Arduino Mini which has the singular responsibility of position tracking.

      b. Moreover, should scenarios involving pupil and body camera recordings connect to the Analog input in the PCB or the real-time computer for optimal data handling and processing?

      Pupil and body camera recordings would be independent data streams which can be recorded separately from behaviorMate. Aligning these forms of full-motion video could require frame triggers, which could be configured on the GPIO board using single TTL-like outputs or by configuring a valve to be “pulsed”, a customization the software provides.

      We also note that a more advanced developer could easily leverage camera signals to provide closed-loop control by writing an independent module that sends UDP packets to behaviorMate. For example, a separate computer-vision-based position tracking module could be written in any preferred language and use UDP messaging to send body-tracking updates to the UI without editing any of the behaviorMate source code (and even be used for updating 1D location).

      (5) Given that all references, as far as I can see, come from the same lab, are there other labs capable of implementing this system at a similar optimal level?

      To date, two additional labs have published using behaviorMate: the Soltez and Henn labs (see revised lines 341-342). Since behaviorMate has only recently been published and made available open source, only external collaborators of the Losonczy lab have had access to the software and design files needed to do this. These collaborators did, however, set up their own behavioral setups in separate locations with minimal direct support from the authors, similar to what anyone seeking to set up a behaviorMate system would find online on our github page or by posting to the message board.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (4) To provide additional context for the significance of this work, additional citations would be helpful to demonstrate a ubiquitous need for a system like behaviorMate. This was most needed in the paragraph from lines 46-65, specifically for each sentence after line 55, where the authors discuss existing variants on head-fixed behavioral paradigms. For instance, for the clause "but olfactory and auditory stimuli have also been utilized at regular virtual distance intervals to enrich the experience with more salient cues", suggested citations include Radvansky & Dombeck 2018 (DOI: 10.1038/s41467-018-03262-4), Fischler-Ruiz et al. 2021 (DOI: 10.1016/j.neuron.2021.09.055).

      We thank the reviewer for the suggested missing citations and have updated the manuscript accordingly (see line 58).

      (5) In addition, it would also be helpful to clarify behaviorMate's implementation in other laboratories. On line 304 the authors mention "other labs" but the following list of citations is almost exclusively from the Losonczy lab. Perhaps the citations just need to be split across the sentence for clarity? E.g. "has been validated by our experimental paradigms" (citation set 1) "and successfully implemented in other labs as well" (citation set 2).

      We have split the citation set as suggested (see lines 338-342).

      Minor Comments:

      (6) In the paragraph starting line 153 and in Fig. 2, please clarify what is meant by "trial" vs. "experiment". In many navigational tasks, "trial" refers to an individual lap in the environment, but here "trial" seems to refer to the whole behavioral session (i.e. synonymous with "experiment"?).

      In our software implementation we had originally used “trial” to refer to an imaging session rather than an experiment (and have made updates to start moving to the more conventional lexicon). To avoid confusion, we have removed this use of “trial” throughout the manuscript and replaced it with “experiment” whenever possible.

      (7) This is very minor, but in Figure 3 and 4, I don't believe the gavage needle is actually shown in the image. This is likely to avoid clutter but might be confusing to some readers, so it may be helpful to have a small inset diagram showing how the needle would be mounted.

      We assessed the image both with and without the gavage needle and found the version in the original (without) to be easier to read and less cluttered and therefore maintained that version in the manuscript.

      (8) In Figure 5 legend, please list n for mice and cells.

      We have updated the Figure 5 legend to indicate that for panels C-G, n=6 mice (all mice were recorded in both VR and TM systems), with 3253 cells classified as significantly tuned place cells in VR and 6101 tuned cells in TM.

      (9) Line 414: It is not necessary to tilt the entire animal and running wheel as long as the head-bar clamp and objective can rotate to align the imaging window with the objective's plane of focus. Perhaps the authors can just clarify the availability of this option if users have a microscope with a rotatable objective/scan head.

      We have added the suggested caveat to the manuscript in order to clarify when the goniometers might be useful (see lines 281-288).

      (10) Figure S1 and S2 could be referenced explicitly in the main text with their related main figures.

      We have added explicit references to figures S1 and S2 in the relevant sections (see lines 443, 460, and 570)

      (11) On line 532-533, is there a citation for "proximal visual cues and tactile cues (which are speculated to be more salient than visual cues)"?

      We have added citations to both Knierim & Rao 2003 and Renaudineau et al. 2007 which discuss the differential impact of proximal vs distal cues during navigation as well as Sofroniew et al. 2014 which describe how mice navigate more naturally in a tactile VR setup as opposed to purely visual ones.

      (12) There is a typo at the end of the Figure 2 legend, where it should say "Arduino Mini."

      This typo has been fixed.

      Reviewer #2 (Recommendations For The Authors):

      (4) As mentioned in the public review: what is the major advantage of taking the IoT approaches as opposed to USB connections to the host computer, especially when behaviorMate relies on a central master computer regardless? The authors mentioned the readability of the JSON messages, making the system easier to debug. However, the flip side of that is the efficiency of data transmission. Although the bandwidth/latency is usually more than enough for transmitting data and commands for behavior devices, the efficiency may become a problem when neural recording devices (imaging or electrophysiology) need to be included in the system.

      behaviorMate is not intended to do everything; it is limited mainly to controlling behavior and providing synchronizing TTL-style triggers. In this way the system can be replicated easily and inexpensively across multiple recording setups, which is particularly useful for constructing additional animal training setups. The system is fully sufficient for capturing behavioral inputs at relevant timescales (see the benchmarks in Figures 3 and 4, as well as the position-correlated neural activity in Figures 5 and 6). Additional hardware might be needed to align behaviorMate output with neural data; for example, a high-speed DAQ or input channels on an electrophysiology recording setup could be utilized (if available). As all recording setups are different, the ideal solution depends on details which are hard to anticipate. We do not mean to convey that the full neural data would be transmitted through the behaviorMate system (especially over the JSON/UDP communications that behaviorMate relies on).
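      The JSON-over-UDP messaging style discussed above can be sketched generically as follows. This is an illustrative sketch, not behaviorMate's actual API; the message fields and names here are hypothetical.

```python
import json
import socket

def send_event(sock, addr, event):
    """Serialize an event dict as JSON and send it as one UDP datagram."""
    sock.sendto(json.dumps(event).encode("utf-8"), addr)

def recv_event(sock):
    """Receive one UDP datagram and parse it back into a dict."""
    data, _ = sock.recvfrom(4096)
    return json.loads(data.decode("utf-8"))

if __name__ == "__main__":
    # Loopback demonstration: a 'UI' process sends a command to a 'controller'.
    controller = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    controller.bind(("127.0.0.1", 0))  # OS-assigned port
    ui = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # Hypothetical message shape: ask a behavior controller to open a valve.
    send_event(ui, controller.getsockname(), {"valve": {"pin": 5, "open_ms": 100}})
    msg = recv_event(controller)
    print(msg["valve"]["open_ms"])  # -> 100
```

The human-readable JSON payloads are what make such a system easy to inspect and debug on the wire, at the cost of some transmission efficiency relative to a binary protocol.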

      (5) The author mentioned labView. A popular open-source alternative is bonsai (https://github.com/bonsai-rx/bonsai). Both include a graphical-based programming interface that allows the users to easily reconfigure the hardware system, which behaviorMate seems to lack. Additionally, autopilot (https://github.com/auto-pi-lot/autopilot) is a very relevant project that utilizes a local network for multiple behavior devices but focuses more on P2P communication and rigorously defines the API/schema/communication protocols for devices to be compatible. I think it's important to include a discussion on how behaviorMate compares to previous works like these, especially what new features behaviorMate introduces.

      We believe that behaviorMate provides a more opinionated and complete solution than the projects mentioned. A wide variety of 1D navigational paradigms can be constructed in behaviorMate without the need to write any novel software. Bonsai, for example, is a “visual programming language” and would require experimenters to construct a custom implementation of each of their experiments. We have opted to use Java for the UI, with distributed computations across modules in various languages. Given the IoT methodology, any number of programming languages or APIs could be used; a large number of design decisions were made when building the project, and we have opted not to include this level of detail in the manuscript in order to maintain readability. We strongly believe in using non-proprietary and open-source projects when possible, which is why the comparison with LabView-based solutions was included in the introduction. We have also added a reference to autopilot in the section of the introduction where this is discussed.

      (6) One of the reasons labView/bonsai are popular is they are inherently parallel and can simultaneously respond to events from different hardware sources. While the JSON events in behaviorMate are asynchronous in nature, the handling of those events seems to happen only in a main event loop coupled with GUI, which is sequential by nature. Is there any multi-threading/multi-processing capability of behaviorMate? If so it's an important feature to highlight. If not I think it's important to discuss the potential limitation of the current implementation.

      IoT solutions are inherently concurrent since the computation is distributed, and additional parallelism could be added by further distributing concerns among independent modules running on independent hardware. The UI has an event loop which aggregates inputs and then sequentially updates contexts based on the current state of those inputs. This sort of “snapshot” of the current state is necessary in order to reason about when to start certain contexts based on their settings and applied decorators. While the behaviorMate UI uses multithreading libraries in Java to be more performant in certain cases, the degree to which this represents true vs “virtual” concurrency depends on the PC architecture it is run on and how the operating system allocates resources. For this reason, we have argued in the manuscript that behaviorMate is sufficient for controlling experiments at behaviorally relevant timescales; we have presented benchmarks and discussed different synchronization approaches so that users can determine whether this is sufficient for their needs.

      (7) The context list is an interesting and innovative approach to abstract behavior contingencies into a data structure, but it's not currently discussed in depth. I think it's worth highlighting how the context list can be used to cover a wide range of common behavior experimental contingencies with detailed examples (line 185 might be a good example to give). It's also important to discuss the limitation, as currently the context lists seem to only support contingencies based purely on space and time, without support for more complicated behavior metrics (e.g. deliver reward only after X% correct).

      To access more complex behavior metrics during runtime, custom context list decorators would need to be implemented. While this is less common in the sort of 1D navigational behaviors the project was originally designed to control, adding novel decorators is a simple process that only requires basic object-oriented programming knowledge. As discussed, we are also implementing a plugin architecture in the JavaFX update to streamline these types of additions.

      Minor Comments:

      (8) In line 202, the author suggests that a single TTL pulse is sent to mark the start of a recording session, and this is used to synchronize behavior data with imaging data later. In other words, there are no synchronization signals for every single sample/frame. This approach either assumes the behavior recording and imaging are running on the same clock or assumes evenly distributed recording samples over the whole recording period. Is this the case? If so, please include a discussion on limitations and alternative approaches supported by behaviorMate. If not, please clarify how exactly synchronization is done with one TTL pulse.

      While the TTL pulse triggers the start of neural data acquisition in our setups, various options exist for controlling for the described clock drift across experiments, and the appropriate one depends on the type of recordings made, the frame rate, the duration of recording, etc. behaviorMate therefore leaves open many options for synchronization at different timescales (e.g., adding a frame-sync circuit as shown in Figure 1, or sending TTL pulses to the same DAQ recording electrophysiology data). Expanded consideration of different synchronization methods has been included in the manuscript (see lines 224-238).
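      As a generic illustration of the drift problem (not code from behaviorMate), two sync events recorded on both clocks are enough to fit a linear map from the behavior clock to the imaging clock, correcting both constant offset and linear drift; the function name here is ours.

```python
def fit_clock_map(t0_beh, t1_beh, t0_img, t1_img):
    """Return a function mapping behavior-clock timestamps onto the
    imaging clock.  The two anchor pairs are shared sync events
    (e.g. TTL pulses) recorded on both clocks; any timestamps between
    them are corrected by linear interpolation."""
    scale = (t1_img - t0_img) / (t1_beh - t0_beh)
    return lambda t: t0_img + scale * (t - t0_beh)

# Example: the imaging clock starts 5 s later and runs 0.1% fast.
to_imaging = fit_clock_map(0.0, 100.0, 5.0, 105.1)
midpoint = to_imaging(50.0)  # ~55.05 on the imaging clock
```

With only a single start pulse, the same map degenerates to a constant offset, which is why a second anchor (or per-frame sync signal) is needed whenever drift over the recording duration matters.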

      (9) Is the computer vision-based calibration included as part of the GUI functionality? Please clarify. If it is part of the GUI, it's worth highlighting as a very useful feature.

      The computer vision-based benchmarking is not included in the GUI; it is a script made specifically for this paper. However, for treadmill-based experiments, behaviorMate has other calibration tools built in (see lines 301-303).

      (10) I went through the source code of the Arduino firmware, and it seems most "open X for Y duration" functions are implemented using the delay function. If this is indeed the case, it's generally a bad idea since delay completely pauses the execution and any events happening during the delay period may be missed. As an alternative, please consider approaches comparing timestamps or using interrupts.

      We have avoided the use of interrupts on the GPIO due to the potential for unpredictable code execution. A blocking delay is executed only if the duration is 10 ms or less, as we cannot guarantee that the Arduino event loop cycles faster than this; durations longer than 10 ms are timestamped and handled in a non-blocking manner. We have adjusted this MAX_WAIT to be specified as a macro so it can be more easily changed (or set to 0).
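      The timestamp-comparison pattern referred to above can be sketched as follows. This is an illustrative Python sketch rather than the actual Arduino firmware; the class and method names are ours.

```python
class NonBlockingValve:
    """'Open for a duration' implemented by comparing timestamps on each
    event-loop cycle, instead of a blocking delay() that would freeze the
    loop and risk missing concurrent events."""

    def __init__(self):
        self._open_until = None  # timestamp when the valve should close

    def open_for(self, duration, now):
        """Schedule the valve to stay open until now + duration."""
        self._open_until = now + duration

    def update(self, now):
        """Call once per loop cycle; returns True while the valve is open."""
        if self._open_until is not None and now < self._open_until:
            return True
        self._open_until = None
        return False

valve = NonBlockingValve()
valve.open_for(0.1, now=0.0)
print(valve.update(now=0.05))  # True: still within the open window
print(valve.update(now=0.20))  # False: window elapsed, valve closed
```

Because `update` returns immediately, the surrounding event loop stays free to service other inputs during the open window, which is the point of avoiding a blocking delay for longer durations.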

      (11) Figure 3 B, C, D, and Figure 4 D, E suffer from noticeable low resolution.

      We have converted Figure 3B, C, D and 4C, D, E to vector graphics in order to improve the resolution.

      (12) Figure 4C is missing, which is an important figure.

      This figure appeared when we rendered and submitted the manuscript; we apologize if it was generated such that it did not load properly in all PDF viewers. The panel appears correctly in the online eLife version of the manuscript. Additionally, we have checked the revision in Preview on macOS, in Adobe Acrobat, and in the built-in viewer in Chrome; all figure panels appear in each, so we hope this issue has been resolved.

      (13) There are thin white grid lines on all heatmaps. I don't think they are necessary.

      The grid lines have been removed from the heatmaps as suggested.

      (14) Line 562 "sometimes devices directly communicate with each other for performance reasons", I didn't find any elaboration on the P2P communication in the main text. This is potentially worth highlighting as it's one of the advantages of taking the IoT approaches.

      In our implementation it was not necessary to rely on P2P communication beyond what is indicated in Figure 1. The direct communication referred to in line 562 refers only to the examples expanded on in the rest of that paragraph, i.e., the behavior controller may signal the microscope directly with a TTL signal without looping back through the UI. Users could implement UDP message passing between devices as necessary, but this is outside the scope of what we present in the manuscript.

      (15) Line 147 "Notably, due to the systems modular architecture, different UIs could be implemented in any programming language and swapped in without impacting the rest of the system.", this claim feels unsupported without a detailed discussion of how new code can be incorporated in the GUI (plugin system).

      This comment refers to the idea of implementing “different UIs”, i.e., users wishing to take advantage of the JSON messaging API and the proposed electronics while fully implementing their own interface. To facilitate this option we have improved the documentation of the messaging API posted in the README file accompanying the Arduino source code, and we have added a reference to the supplemental materials where readers can find a link to the JSON API implementation to clarify this point.

      Additionally, while a plugin system is available in the JavaFX version of behaviorMate, that version is currently under development, and we will update the online documentation as it matures; it is, however, unrelated to the intended claim about completely swapping out the UI.

      Reviewer #3 (Recommendations For The Authors):

      (6) Figure 1 - the terminology for each item is slightly different in the text and the figure. I think making the exact match can make it easier for the reader.

      - Real-time computer (figure) vs real-time controller (ln88).

      The manuscript was adjusted to match figure terminology.

      - The position controller (ln565) - position tracking (Figure).

      We have updated Figure 1 to highlight that the position controller does the position tracking.

      - Maybe add a Behavior Controller next to the GPIO box in Figure 1.

      We updated Figure 1 to highlight that the Behavior Controller performs the GPIO responsibility such that "Behavior Controller" and "GPIO circuit" may be used interchangeably.

      - Position tracking (fig) and position controller (subtitle - ln209).

      We updated Figure 1 to highlight that the position controller does the position tracking.

      - Sync Pulse is not explained in the text.

      The caption for Figure 1 has been updated to better explain the Sync Pulse and the additional systems boxes.

      (7) For Figure 3B/C: What is the number of data points? It would be nice to see the real population, possibly using a swarm plot instead of box plots. How likely are these outliers to occur?

      In order to better characterize the distributions presented in our benchmarking data, we have added mean and standard deviation information to the plots in Figures 3 and 4. For Figure 3B: 0.0025 +/- 0.1128, Figure 3C: 12.9749 +/- 7.6581, Figure 4C: 66.0500 +/- 15.6994, Figure 4E: 4.1258 +/- 3.2558.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      We thank the reviewer for the time and effort in providing very useful comments and suggestions for our manuscript.

      (1) The results do not support the conclusions. The main "selling point" as summarized in the title is that the apoptotic rate of zebrafish motorneurons during development is strikingly low (~2% ) as compared to the much higher estimate (~50%) by previous studies in other systems. The results used to support the conclusion are that only a small percentage (under 2%) of apoptotic cells were found over a large population at a variety of stages 24-120hpf. This is fundamentally flawed logic, as a short-time window measure of percentage cannot represent the percentage in the long term. For example, at any year under 1% of the human population dies, but over 100 years >99% of the starting group will have died. To find the real percentage of motorneurons that died, the motorneurons born at different times must be tracked over the long term or the new motorneuron birth rate must be estimated. A similar argument can be applied to the macrophage results. Here the authors probably want to discuss well-established mechanisms of apoptotic neuron clearance such as by glia and microglia cells.

      We chose the time window of 24-120 hpf for the following two reasons: 1) Previous studies showed that although the time windows of motor neuron death vary in chick (E5-E10), mouse (E11.5-E15.5), rat (E15-E18), and human (11-25 weeks of gestation), their common feature is that they are all developmental periods when motor neurons make contact with muscle cells. The contact between zebrafish motor neurons and muscle cells occurs before 72 hpf, which is included in our observation time window of 24-120 hpf. 2) Zebrafish complete hatching during 48-72 hpf, and most organs form before 72 hpf. More importantly, zebrafish start swimming around 72 hpf, indicating that motor neurons are fully functional by that time. Thus, we are confident that the 24-120 hpf window covers the period during which motor neurons undergo programmed cell death in early zebrafish development. We have added this information to the revised manuscript.

      We frequently used “early development” in this manuscript to describe our observations; however, we omitted “early” from the title. We have therefore added this key word to the title in the revised manuscript.

      Previous studies in zebrafish have shown that the production of spinal cord motor neurons largely ceases before 48 hpf, and then the motor neurons remain largely constant until adulthood (doi: 10.1016/j.celrep.2015.09.050; 10.1016/j.devcel.2013.04.012; 10.1007/BF00304606; 10.3389/fcell.2021.640414). Our observation time window covers the major motor neuron production process. Therefore, we believe that neurogenesis will not affect our findings and conclusions.

      We discussed the engulfment of dead motor neurons by other types of cells in the discussion section.

      (2) The transgenic line is perhaps the most meaningful contribution to the field as the work stands. However, the mnx1 promoter is well known for its non-specific activation - while the images suggest the authors' line is good, motor neuron markers should be used to validate the line. This is especially important for assessing this population later as mnx1 may be turned off in mature neurons.

      The mnx1 promoter has been widely used to label motor neurons in transgenic zebrafish. Previous studies have shown that most of the cells labeled in the mnx1 transgenic zebrafish are motor neurons. In this study, we observed that the neuronal cells in our sensor zebrafish formed green cell bodies inside of the spinal cord and extended to the muscle region, which is an important morphological feature of the motor neurons.

      Reviewer 2:

      We thank the reviewer for the time and effort in making very useful comments and suggestions for our manuscript.

      The FRET-based programmed cell death biosensor described in this manuscript could be very useful. However, the authors have not considered what is already known about the development and programmed cell death of zebrafish spinal motor neurons, and potential differences between motor neuron populations innervating different types of muscles in different vertebrate models. Without this context, the application of their new biosensor tool does not provide new insights into zebrafish motor neuron programmed cell death. In addition, the authors have not carried out controls to show the efficacy and specificity of their morpholinos. Nor have they described how they counted dying motor neurons, or why they chose the specific developmental time points they addressed. These issues are addressed more specifically below.

      (1) Lines 12-13: Previous studies in zebrafish showed death of identified spinal motor neurons.

      Line 103: In Figure 2A the cell body in the middle is that of identified motor neuron VaP. VaP death has previously been described in several publications. The cell body on the right of the same panel appears to belong to an interneuron whose axon can be seen extending off to the left in one of the rostrocaudal axon bundles that traverse the spinal cord. Higher-resolution imaging would clarify this.

      Lines 163-164: Is this the absolute number of motor neurons that died? How were the counts done? Were all the motor neurons in every segment counted? There are approximately 30 identifiable VaP motor neurons in each embryo and they have previously been reported to die between 24-36 hpf. So this analysis is likely capturing those cells.

      Our study examined overall motor neuron apoptosis rather than the death of a specific type of motor neuron, so we did not emphasize the death of VaP motor neurons. We agree that the dead motor neurons observed in our manuscript include VaP motor neurons. However, other types of dead motor neurons were also observed in our study, for the following reasons: 1) VaP primary motor neurons die before 36 hpf, but we found motor neurons that died after 36 hpf and even at 84 hpf (revised Figure 4A). 2) The VaP motor neuron sits together with the CaP motor neuron at the caudal region of the motor neuron cluster; although it is rare, we did observe the death of motor neurons in the rostral region of the motor neuron cluster (revised Figure 2C). 3) There is at most one VaP motor neuron in each motor neuron cluster; although our data showed that usually one motor neuron died per cluster, we did observe that sometimes more than one motor neuron died in a cluster (revised Figure 2C). We have included this information in the revised discussion.

      (2) Lines 82-83: It is published that mnx1 is expressed in at least one type of spinal interneuron derived from the same embryonic domain as motor neurons.

      The mnx1 promoter has been widely used to label motor neurons in transgenic zebrafish. Previous studies have shown that most of the cells labeled in the mnx1 transgenic zebrafish are motor neurons. In this study, we observed that the neuronal cells in our sensor zebrafish formed green cell bodies inside of the spinal cord and extended to the muscle region, which is an important morphological feature of the motor neurons.

      Furthermore, a few of those green cell bodies turned into blue apoptotic bodies inside the spinal cord and changed to blue axons in the muscle regions at the same time, which strongly suggests that those apoptotic neurons are not interneurons. Although the mnx1 promoter might have labeled some interneurons, this will not affect our major finding that only a small portion of motor neurons died during zebrafish early development.

      (3) Lines 161-162: Although this may be the major time window of neurogenesis, there are many more motor neurons in adults than in larvae. Neither of these references describes the increase in motor neuron numbers over this particular time span, so the rationale for this choice is unclear.

      Lines 168-171: It is known that later developing motor neurons are still being generated in the spinal cord at this time, suggesting that if there is a period of programmed cell death similar to that described in chick and mouse, it would likely occur later. In addition, most of the chick and mouse studies were performed on limb-innervating motor neurons, rather than the body wall muscle-innervating motor neurons examined here.

      Lines 237-238: Especially since new motor neurons are still being generated at this time.

      Previous studies have shown that the production of spinal cord motor neurons largely ceases before 48 hpf in zebrafish, and that the motor neurons then remain largely constant until adulthood (doi: 10.1016/j.celrep.2015.09.050; 10.1016/j.devcel.2013.04.012; 10.1007/BF00304606; 10.3389/fcell.2021.640414). Our observation time window covers the major motor neuron production process. Therefore, we believe that neurogenesis will not affect our data and conclusions.

      The death of motor neurons has been extensively studied in limb-innervating motor neurons in chicks and rodents, as this system is amenable to manipulations such as amputation. However, previous studies have shown that this dramatic motor neuron death occurs not only in limb-innervating motor neurons but also in other spinal cord motor neurons (doi: 10.1006/dbio.1999.9413). In our manuscript, we studied the naturally occurring motor neuron death in the whole spinal cord during the early stage of zebrafish development.

      (4) Lines 184-187: Previous publications showed that death of VaP is independent of limitations in muscle innervation area, suggesting it is not coupled to muscle-derived neurotrophic factors.

      Lines 328-334: There have been many publications describing appropriate morpholino controls. The authors need to describe their controls and show that they know that the genes they were targeting were downregulated.

      For the morpholinos, we did not confirm the downregulation of the target genes. These morpholino-related data are a minor part of our manuscript and should not affect our major findings. We have removed the neurotrophic factor- and morpholino-related data in the revised manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      The authors of this study set out to find RNA binding proteins in the CNS in cell-type specific sequencing data and discover that the cardiomyopathy-associated protein RBM20 is selectively expressed in olfactory bulb glutamatergic neurons and PV+ GABAergic neurons. They make an HA-tagged RBM20 allele to perform CLIP-seq to identify RBM20 binding sites and find direct targets of RBM20 in olfactory bulb glutmatergic neurons. In these neurons, RBM20 binds intronic regions. RBM20 has previously been implicated in splicing, but when they selectively knockout RBM20 in glutamatergic neurons they do not see changes in splicing, but they do see changes in RNA abundance, especially of long genes with many introns, which are enriched for synapse-associated functions. These data show that RBM20 has important functions in gene regulation in neurons, which was previously unknown, and they suggest it acts through a mechanism distinct from what has been studied before in cardiomyocytes.

      Strengths:

      The study finds expression of the cardiomyopathy-associated RNA binding protein RBM20 in specific neurons in the brain, opening new windows into its potential functions there.

      The study uses CLIP-seq to identify RBM20 binding RNAs in olfactory bulb neurons.

      Conditional knockout of RBM20 in glutamatergic or PV neurons allows the authors to detect mRNA expression that is regulated by RBM20.

      The data include substantial controls and quality control information to support the rigor of the findings.

      Weaknesses:

      The authors do not fully identify the mechanism by which RBM20 acts to regulate RNA expression in neurons, though they do provide data suggesting that neuronal RBM20 does not regulate alternate splicing in neurons, which is an interesting contrast to its proposed mechanism of function in cardiomyocytes. Discovery of the RNA regulatory functions of RBM20 in neurons is left as a question for future studies.

      The study does not identify functional consequences of the RNA changes in the conditional knockout cells, so this is also a question for the future.

    1. Reviewer #3 (Public review):

      Summary:

      The study demonstrates the effectiveness of a cost-effective closed-loop feedback system for modulating brain activity and behavior in head-fixed mice. The authors tested their real-time closed-loop feedback system in head-fixed mice with two types of graded feedback: 1) closed-loop neurofeedback (CLNF), where feedback is derived from neuronal activity (calcium imaging), and 2) closed-loop movement feedback (CLMF), where feedback is based on observed body movement. It is a Python-based open-source system, which the authors call CLoPy. The authors also claim to provide all software, hardware schematics, and protocols needed to adapt it to various experimental scenarios. The system is capable and can be adapted to a wide range of use cases.

      The authors have shown that their system can deliver both positive (water drop) and negative (buzzer/vibrator) reinforcement. The study also shows that, using the closed-loop system, mice performed better, learned an arbitrary task, and adapted to changes in the rule. By integrating real-time feedback based on cortical GCaMP imaging and behavior tracking, the authors provide strong evidence that such closed-loop systems can be instrumental in exploring the dynamic interplay between brain activity and behavior.

      Strengths:

      Simplicity of the feedback systems designed; simplicity of implementation and potential for adoption.

      Weaknesses:

      Long latencies, due to slow Ca2+ dynamics and slow imaging (15 FPS), may limit the application of the system.

      Major comments:

      (1) Page 5 paragraph 1: "We tested our CLNF system on Raspberry Pi for its compactness, general-purpose input/output (GPIO) programmability, and wide community support, while the CLMF system was tested on an Nvidia Jetson GPU device." Can these programs and hardware be integrated with a Windows-based system and a microcontroller (Arduino/Teensy)? For broad adaptability, that is what a lot of labs would already have (please comment/discuss).

      (2) Hardware Constraints: The reliance on Raspberry Pi and Nvidia Jetson (which is expensive) for real-time processing could introduce latency issues (~63 ms for CLNF and ~67 ms for CLMF). This latency might limit precision for faster or more complex behaviors, which the authors should address in the discussion section.

      (3) Neurofeedback Specificity: The task focuses on mesoscale imaging and ignores finer spatiotemporal details. Sub-second events might be significant in more nuanced behaviors. Can this be discussed in the discussion section?

      (4) The activity over 6s is being averaged to determine if the threshold is being crossed before the reward is delivered. This is a rather long duration of time during which the mice may be exhibiting stereotyped behaviors that may result in the changes in DFF that are being observed. It would be interesting for the authors to compare (if data is available) the behavior of the mice in trials where they successfully crossed the threshold for reward delivery and in those trials where the threshold was not breached. How is this different from spontaneous behavior and behaviors exhibited when they are performing the test with CLNF?

    1. eLife Assessment

      This important study develops and exploits novel ideas in dendritic integration and implements these ideas in a neural network. Historically, dendritic plateau potentials were thought to exist primarily for maintaining neurons in a depolarized state for 100s of milliseconds, but this study presents a new perspective that dendritic plateau potentials are equally effective in much shorter integration windows. The computational evidence supporting the article's claims is compelling.

  7. Dec 2024
    1. Author response:

      eLife Assessment:

      In this important study, the authors combine innovative experimental approaches, including direct compressibility measurements and traction force analyses, with theoretical modeling to propose that wild-type cells exert compressive forces on softer HRasV12-transformed cells, influencing competition outcomes. The data generally provide solid evidence that transformed epithelial cells exhibit higher compressibility than wild-type cells, a property linked to their compaction during mechanical cell competition. However, the study would benefit from further characterization of how compression affects the behavior of HRasV12 cells and clearer causal links between compressibility and competition outcomes.

      We thank the reviewers and the editor for their thoughtful and encouraging feedback on our study and for appreciating the innovation in our experimental and theoretical approaches. We acknowledge the importance of further clarifying the mechanistic links between the compressibility of HRas<sup>V12</sup>-transformed cells, their compaction, and the outcomes of mechanical cell competition. In the revised manuscript, we will include additional experiments and analyses to assess how compression influences the cellular behavior and fate of HRas<sup>V12</sup>-transformed cells during competition. In addition, to strengthen the connection between collective compressibility and competition outcomes, we will integrate quantitative analyses of cell dynamics and additional modeling to explicitly correlate the mechanical properties with the spatial and temporal aspects of cell elimination. These additions will address the reviewer’s concerns comprehensively, further enriching the mechanistic understanding presented in the manuscript.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this article, Gupta and colleagues explore the parameters that could promote the elimination of active Ras cells when surrounded by WT cells. The elimination of active Ras cells by surrounding WT cells was previously described extensively and associated with a process named cell competition, a context-dependent elimination of cells. Several mechanisms have been associated with competition, including, more recently, elimination processes based on mechanical stress. This was explored theoretically and experimentally and was associated either with differential growth and sensitivity to pressure and/or with differences in homeostatic density/pressure. This was extensively validated for the case of Scribble mutant cells, which are eliminated by WT MDCK cells due to their higher homeostatic density. However, there has been so far very little systematic characterisation of the mechanical parameters and properties of these different cell types and how they could contribute to mechanical competition.

      Here, the authors used the context of active Ras cells in MDCK cells (with some observations in vivo in the mouse gut, which are a bit more anecdotal) to explore the parameters causal to Ras cell elimination. Using, for the first time, traction force microscopy and stress microscopy combined with Bayesian inference, they first show that clusters of active Ras cells experience higher pressure compared to WT. Interestingly, this occurs in the absence of differences in growth rate, and while Ras cells seem to have lower homeostatic density, in contradiction with the previous models associated with mechanical cell competition. Using a self-propelled Voronoi model, they explored more systematically the conditions that promote the compression of transformed cells, showing globally that higher area compressibility and/or lower junctional tension are associated with higher compressibility. Using an original and novel experimental method to measure the bulk compressibility of cell populations, they then confirmed that active Ras cells are globally twice as compressible as WT cells. This compressibility correlates with a disruption of adherens junctions. Accordingly, the higher pressure near transformed Ras cells can be completely rescued by increasing cell-cell adhesion through E-cad overexpression, which also reduces the compressibility of the transformed cells. Altogether, these results go along the lines of previous theoretical work (Gradeci et al., eLife 2021) suggesting that reduced stiffness/higher compressibility is essential to promote loser cell elimination. Here, the authors provide for the first time a very convincing experimental measurement and validation of this prediction. Moreover, their modelling approach goes far beyond what was performed before in terms of exploring the conditions promoting compressibility, and their experimental data point to alternative mechanisms that may contribute to mechanical competition.

      Strengths:

      - Original methodologies to perform systematic characterisation of mechanical properties of Ras cells during cell competition, which include a novel method to measure bulk compressibility.
      - A very extensive theoretical exploration of the parameters promoting cell compaction in the context of competition.

      We thank the reviewer for their detailed and thoughtful assessment of our study and for recognizing the originality of our methodologies, including the novel bulk compressibility measurement technique and the extensive theoretical exploration of parameters influencing mechanical competition. We are pleased that the reviewer finds our experimental validation and modeling approach convincing and acknowledges the relevance of our findings in advancing the understanding of mechanical cell competition. We will carefully address all the points raised to further clarify and strengthen the manuscript.

      Weaknesses:

      - Most of the theoretical focus is centred on bulk compressibility, but so far it does not really explain the final fate of the transformed cells. Classic cell competition scenarios (including the one involving active Ras cells) lead to the elimination of one cell population either by cell extrusion/cell death or by global delamination. This aspect is absolutely not explored in this article, experimentally or theoretically, and as such it is difficult to connect all the observables with the final outcome of cell competition. For instance, higher compressibility may not lead to loser status if the cells can withstand high density without extruding compared to the WT cells (and could even completely invert the final outcome of the competition). Down the line, and as suggested in most of the previous models/experiments, the relationship between pressure/density and extrusion/death will be the key factor that determines the final outcome of competition. However, there is absolutely no characterisation of cell death/cell extrusion in the article so far.

      We thank the reviewer for highlighting this important point. We agree that understanding the relationship between pressure, density, and the final outcomes of cell competition, such as extrusion and cell death, is crucial to connecting the mechanical properties to competition outcomes. While extrusion and cell death have been extensively characterized in previous works (e.g., https://www.nature.com/articles/s41467-021-27896-z; https://www.nature.com/articles/ncb1853), we nevertheless recognize the need to address this aspect more explicitly in our study. To this end, we have indeed performed experiments to characterize cell extrusion and cell death under varying conditions of pressure and density. We will incorporate these data into the revised manuscript. These additions will provide a more comprehensive understanding of how mechanical imbalance drives cell competition and determine the final fate of transformed cells.

      - While the compressibility measurements are very original and interesting, this bulk measurement could be explained by very different cellular processes, from modulation of cell shape to cell extrusion and tissue multilayering (which, by the way, was already observed for active Ras cells; see for instance https://pubmed.ncbi.nlm.nih.gov/34644109/). This could substantially change the interpretation of this measurement and the extent to which it can explain the compression observed in mixed culture. This compressibility measurement would be much more informative if coupled with an estimation of the change in cell aspect ratio and a rough evaluation of the contribution of cell shape changes versus alternative mechanisms.

      We thank the reviewer for raising this important concern. In our model system and within the experimental timescale of our studies involving gel compression microscopy (GCM) experiments, we do not observe tissue multilayering or cell extrusion, as these measurements are performed on homogeneous populations (pure wild-type or pure transformed cell monolayers). However, to address the reviewer’s suggestion, we will include measurements of cell aspect ratio as well as images eliminating the possibility of multilayering/extrusion in the revised manuscript. These results will provide additional insights into the plausible contributions of cell shape changes. Furthermore, our newer results indicate that the compressibility differences arise from variations in intracellular organization (changes in nuclear and cytoskeletal organization) between wild-type and transformed cells. While a detailed molecular characterization of these underlying mechanisms is beyond the scope of the current manuscript, we acknowledge its importance and plan to explore it in a future study. These revisions will clarify and strengthen the interpretation of our findings.

      - So far, there is no clear explanation of why transformed Ras cells get more compacted in mixed culture compared to pure Ras culture. Previously, the compaction of mutant Scribble cells could be explained by the higher homeostatic density of WT cells, which impose their preferred higher density on the Scribble mutants (see Wagstaff et al. 2016 or Gradeci et al. 2021); however, that is not the case for the Ras cells (which even have slightly higher density at confluency). If I understood properly, the Voronoi model assumes some directional movement of WT cells toward transformed cells, which will actively compact the Ras cells through self-propelled forces (see supplementary methods), but this is never clearly discussed/described in the results section, while potentially being one essential ingredient for observing compaction of transformed cells. In fact, this was already described experimentally in the case of Scribble competition and associated with chemoattractant secretion from the mutant cells promoting directed migration of the WT (https://pubmed.ncbi.nlm.nih.gov/33357449/). It would be essential to show what happens in the absence of directional propelled movement in the model and to validate experimentally whether there is indeed directional movement of the WT toward the transformed cells. Without this, the current data do not really explain the competition process.

      We introduced directional movement of wild-type cells towards neighbouring transformed cells (and a form of active force to be exerted by them), motivated by the tissue compressibility measurements from the Gel Compression Microscopy experiments (Fig. 4E-L). This allowed us to devise an equivalent method of measuring the material response to isotropic compression within the SPV model framework. While the role of directional propelled movement is an area of ongoing investigation and has not been explored extensively within the current study, we emphasize that even without directional propulsion in the model, our results demonstrate compressive stress or elevated pressure, and increased compaction within the transformed population under the suitable conditions reported in this work (when k<1), exhibiting a greater tissue-level compressibility in the transformed cells compared to WT cells (Figs. 4C-D), thereby laying the groundwork for competition. To address these concerns, we will provide additional results as well as detailed discussions on the effect of cell movements in compression.

      - Some of the data lack information on statistics, especially for all the stress microscopy and traction forces, where we do not really know how representative the stress patterns are (how many experiments? are they averages of several movies? integrated over which temporal window?).

      We thank the reviewer for highlighting the need for additional details regarding the statistical representation of our stress microscopy and traction force data. We will address these concerns in the revised manuscript by providing clear descriptions of the number of experiments, the averaging methodology, and the temporal windows used for analysis. Currently, Figs. 2A and 2C represent data from single time points, as the traction and stress landscapes evolve dynamically as transformed cells begin extruding (as shown in Supplementary movie 1). In contrast, Fig. 2H represents data collected from several samples across three independent experiments, all measured at the 3-hour time point following doxycycline induction. This specific time point is critical because it captures the emergence of compressive stresses before extrusion begins, simplifying the analysis and ensuring consistency. We will ensure these details are clearly articulated in the revised text and figure legends.

      Reviewer #2 (Public review):

      The work by Gupta et al. addresses the role of tissue compressibility as a driver of cell competition. The authors use a planar epithelial monolayer system to study cell competition between wild type and transformed epithelial cells expressing HRasV12. They combine imaging and traction force measurements, from which the authors propose that wild type cells generate compressive forces on transformed epithelial cells. The authors further present a novel setup to directly measure the compressibility of adherent epithelial tissues. These measurements suggest a higher compressibility of transformed epithelial cells, which is causally linked to a reduction in cell-cell adhesion in transformed cells. The authors support their conclusions by theoretical modelling using a self-propelled Voronoi model, which supports that differences in tissue compressibility can lead to compression of the softer tissue type.

      The experimental framework to measure the tissue compressibility of adherent epithelial monolayers establishes a novel tool; however, additional controls of this measurement appear to be required. Moreover, the experimental support of this study is mostly based on single representative images and would greatly benefit from additional data and their quantitative analysis to support the authors' conclusions. Specific comments are also listed in the following:

      Major points:

      It is not evident in Fig. 2A that traction forces increase along the interface between the wild type and transformed populations, and stresses in Fig. 2C also seem to be similar at the interface and in the surrounding cell layer. Only representative examples are provided, and a quantification of sigma_m needs to be provided.

      In Figures 1-3, only panels 2G and 2H provide a quantitative analysis, but it is not clear how many regions of interest and clusters of transformed cells were quantified.

      We thank the reviewer for their detailed comments and for highlighting the importance of additional quantitative analyses to support our conclusions. We appreciate their recognition of our novel experimental framework to measure tissue compressibility and the overall approach of our study. Regarding Fig. 2A and Fig. 2C, we acknowledge the need for further clarity. While the traction forces and stress patterns may not appear uniformly distinct at the interface in the representative images, these differences are more evident at specific time points before extrusion begins. Please note that the traction and stress landscapes evolve dynamically as transformed cells begin extruding (as shown in Supplementary movie 1). We will include a quantification of σ<sub>m</sub>​ and additional data from multiple experiments to substantiate the observations and address this concern in the revised manuscript. Currently, the data in Fig. 2G and Fig. 2H represent several regions of interest and transformed cell clusters collected from three independent experiments, all analyzed at the 3-hour time point after doxycycline induction. This time point was chosen because it captures the compressive stress emergence without interference from extrusion processes, simplifying the analysis. We will expand these sections with detailed descriptions of the sample sizes and statistical analyses to ensure greater transparency and reproducibility. These revisions will provide a stronger quantitative foundation for our findings and address the reviewer's concerns.

      Several statements appear to be insufficiently justified and supported by data. For example, the statement on pg 3, line 38 seems to lack supportive data: 'This comparison revealed that the thickness of HRasV12-expressing cells was reduced by more than 1.7-fold when they were surrounded by wild type cells. These observations pointed towards a selective, competition-dependent compaction of HRasV12-expressing transformed cells but not control cells, in the intestinal villi of mice.' Similarly, the statement about a cell area change of 2.7-fold (pg 3, line 47) lacks support by measurements.

      We thank the reviewer for pointing out the need for more supportive data to justify several statements in the manuscript. Specifically, the observation regarding the reduction in the thickness of HRas<sup>V12</sup>-expressing cells by more than 1.7-fold when surrounded by wild-type cells, and the statement about a 2.7-fold change in cell area, will be supported by detailed measurements. In the revised manuscript, we will include quantitative analyses with additional figures that clearly document these changes. These figures will provide representative images, statistical summaries, and detailed descriptions of the measurements to substantiate these claims. We appreciate the reviewer highlighting these areas and will ensure that all statements are robustly backed by data.

      What is the rationale for setting K<sub>p</sub> = 1 in the model assumptions if clear differences in the junctional membranes of transformed versus wild type cells occur, including dynamic ruffling? This assumption does not seem to be in line with biological observations.

      While the specific role of K<sub>p</sub> in the differences observed in the junctional membranes of transformed versus WT cells, including dynamical ruffling, is not directly studied in this work, our findings indicate that the lower junctional tension (weaker and less stable cellular junctions) in mutant cells is influenced primarily by competition in the dimensionless cell shape index within the model. This also suggests a larger preferred cell perimeter (P<sub>0</sub>) for mutant cells, corresponding to their softer, unjammed state. Huang et al. (https://doi.org/10.1039/d3sm00327b) have previously argued that a high P<sub>0</sub> may, in some cases, result from elevated cortical tension along cell edges, or reflect weak membrane elasticity, implying a smaller K<sub>p</sub>. While this connection could be an intriguing avenue for future exploration, we emphasize that K<sub>p</sub> is not expected to alter any of the key findings or conclusions reported in this work. We will include any required analysis and corresponding discussions in the revised manuscript.
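      For context, a schematic form of the energy functional commonly used in self-propelled Voronoi models, in which the parameters K<sub>p</sub> and P<sub>0</sub> discussed above appear, can be written as follows. This is a standard form from the SPV literature (e.g., Bi et al., Phys. Rev. X 2016); the authors' exact implementation may differ:

      ```latex
      E = \sum_{i} \left[ K_A \left( A_i - A_0 \right)^2 + K_P \left( P_i - P_0 \right)^2 \right],
      \qquad p_0 \equiv \frac{P_0}{\sqrt{A_0}}
      ```

      Here $A_i$ and $P_i$ are the area and perimeter of cell $i$, $K_A$ and $K_P$ are the area and perimeter elastic moduli, and the dimensionless target shape index $p_0$ controls tissue rigidity: a larger $p_0$ (or a weaker $K_P$) corresponds to a softer, more unjammed tissue, consistent with the discussion of the mutant cells above.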

      The novel approach to measure tissue compressibility is based on pH dependent hydrogels. As the pH responsive hydrogel pillar is placed into a culture medium with different conditions, an important control would be if the insertion of this hydrogel itself would change the pH or conditions of the culture assays and whether this alters tissue compressibility or cell adhesion. The authors could for example insert a hydrogel pillar of a smaller diameter that would not lead to compression or culture cells in a larger ring to assess the influence of the pillar itself.

      We appreciate the reviewer’s insightful comment regarding the potential effects of the pH-responsive hydrogel pillar on the culture conditions and tissue compressibility. In our experiments, the expandable hydrogels are kept separate from the cells until the pH of the hydrogel is elevated to 7.4, ensuring that the hydrogel does not impact the culture environment. However, we acknowledge the concern and will include additional controls in the revised manuscript. Specifically, we will insert a hydrogel pillar with a smaller diameter that would not induce compression, or culture cells in a larger ring, to assess any potential influence of the hydrogel pillar itself. This will help to further validate our experimental setup.

      The authors focus on the study of cell compaction of the transformed cells, but how does this ultimately lead to a competitive benefit of wild type cells? Is a higher rate of extrusion observed and associated with the compaction of transformed cells or is their cell death rate increased? While transformed cells seem to maintain a proliferative advantage it is not clear which consequences of tissue compression ultimately drive cell competition between wild type and transformed cells.

      We thank the reviewer for highlighting this important point. We agree that understanding how tissue compression leads to a competitive advantage for wild type cells is crucial. While our current study focuses on the mechanical properties of transformed cells leading to the compaction and subsequent extrusion of the transformed cells, we recognize the need to explicitly connect these properties to the final outcomes of cell competition, such as extrusion or cell death. Although extrusion and cell death have been extensively characterized in previous studies (e.g., https://www.nature.com/articles/s41467-021-27896-z; https://www.nature.com/articles/ncb1853), we have indeed performed additional experiments to investigate the relationship between pressure, density, and these processes in our system. In the revised manuscript, we will include these new data, which will help to clarify how mechanical stress, driven by tissue compression, contributes to the competition between wild type and transformed cells and influences their eventual fate.

      The argument that softer tissues would be more easily compressed is plausible. However, which mechanism do the authors suggest is generating the actual compressive stress to drive the compaction of transformed cells? They exclude a proliferative advantage of wild type cells; which other mechanisms generate the compressive forces exerted by the wild type cells?

      We thank the reviewer for raising this important question. As the reviewer rightly points out, in our model system we do not observe a proliferative advantage for the wild-type cells; the compressive forces exerted by the wild-type cells arise from their intrinsic mechanical properties, such as their lower compressibility compared to the transformed cells. This difference in compressibility results in wild-type cells generating compressive stress at the interface with the transformed cells. Regarding the mechanism underlying the increased compressibility of the transformed cells, our newer findings indicate that the differences in compressibility arise from variations in intracellular organization, specifically changes in nuclear and cytoskeletal organization between wild-type and transformed cells. While a detailed molecular characterization of these mechanisms is beyond the scope of the current manuscript, we acknowledge its significance and plan to investigate it in future work. We will, nevertheless, include a detailed discussion of the mechanism underlying the differential compressibility of wild-type and transformed cells in the revised manuscript.

    1. Enable Portable mode (Windows, Linux): After unzipping the VS Code download, create a `data` folder within VS Code's folder:

      enable portable mode
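      The step quoted above can be sketched as a shell command. The extraction folder name `VSCode-linux-x64` is an assumption here; adjust it to match the folder your download actually unzipped to (on Windows, create the `data` folder next to `Code.exe`):

      ```shell
      # Create the 'data' folder that switches this VS Code copy into portable mode.
      # Once the folder exists, VS Code stores settings, keybindings, and extensions
      # inside it instead of in the user profile.
      mkdir -p VSCode-linux-x64/data
      ```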

    1. Use CNG Engine You can set the script to use the OpenSSL Cryptographic Next Generation (CNG) Engine when running the script, instead of the default CAPI engine. This provides support for the CNG private key when processing certificates in the Windows Certificate Store. To set the CNG Engine, add the following to the Web section of the default.cfg file: OpensslEngineType=1 To use the CNG Engine when recording the script, see HTTP properties > Advanced recording options

      Remove this section. Now we have RTS UI to enable CNG replay.

    1. Reviewer #3 (Public review):

      Summary:

      In their study, McDermott et al. investigate the neurocomputational mechanism underlying sensory prediction errors. They contrast two accounts: representational sharpening and dampening. Representational sharpening suggests that predictions increase the fidelity of the neural representations of expected inputs, while representational dampening suggests the opposite (decreased fidelity for expected stimuli). The authors performed decoding analyses on EEG data, showing that expected stimuli could at first be better decoded (sharpening), followed by a reversal during later response windows where unexpected inputs could be better decoded (dampening). These results are interpreted in the context of opposing process theory (OPT), which suggests that such a reversal would support perception being both veridical (i.e., initial sharpening to increase the accuracy of perception) and informative (i.e., later dampening to highlight surprising, but informative, inputs).

      Strengths:

      The topic of the present study is of significant relevance to the field of predictive processing. The experimental paradigm used by McDermott et al. is well designed, allowing the authors to avoid several common confounds in investigating predictions, such as stimulus familiarity and adaptation. The introduction of the manuscript provides a well-written summary of the main arguments for the two accounts of interest (sharpening and dampening), as well as OPT. Overall, the manuscript serves as a good overview of the current state of the field.

      Weaknesses:

      In my opinion, several details of the methods, results, and manuscript raise doubts about the quality and reliability of the reported findings. Key concerns are:

      (1) The results in Figure 2C seem to show that the leading image itself can only be decoded with ~33% accuracy (25% chance; i.e., ~8% above-chance decoding). In contrast, Figure 2E suggests the prediction (surprisingly, whether valid or invalid) during the leading image presentation can be decoded with ~62% accuracy (50% chance; i.e., ~12% above-chance decoding). Unless I am misinterpreting the analyses, it seems implausible to me that a prediction, but not an actually shown image, can be better decoded using EEG than an image that is presented on-screen.

      (2) The "prediction decoding" analysis is described by the authors as "decoding the predictable trailing images based on the leading images". How this was done is however unclear to me. For each leading image decoding the predictable trailing images should be equivalent to decoding validity (as there were only 2 possible trailing image categories: 1 valid, 1 invalid). How is it then possible that the analysis is performed separately for valid and invalid trials? If the authors simply decode which leading image category was shown, but combine L1+L2 and L4+L5 into one class respectively, the resulting decoder would in my opinion not decode prediction, but instead dissociate the representation of L1+L2 from L4+L5, which may also explain why the time-course of the prediction peaks during the leading image stimulus-response, which is rather different compared to previous studies decoding predictions (e.g. Kok et al. 2017). Instead for the prediction analysis to be informative about the prediction, the decoder ought to decode the representation of the trailing image during the leading image and inter-stimulus interval. Therefore I am at present not convinced that the utilized analysis approach is informative about predictions.

      (3) I may be misunderstanding the reported statistics or analyses, but it seems unlikely that >10 of the reported contrasts have the exact same statistic of Tmax= 2.76. Similarly, it seems implausible, based on visual inspection of Figure 2, that the Tmax for the invalid condition decoding (reported as Tmax = 14.903) is substantially larger than for the valid condition decoding (reported as Tmax = 2.76), even though the valid condition appears to have superior peak decoding performance. Combined these details may raise concerns about the reliability of the reported statistics.

      (4) The reported analyses and results do not seem to support the conclusion of early learning resulting in dampening and later stages in sharpening. Specifically, the authors appear to base this conclusion on the absence of a decoding effect in some time-bins, while in my opinion a contrast between time-bins, showing a difference in decoding accuracy, is required. Or better yet, a non-zero slope of decoding accuracy over time should be shown (not contingent on post-hoc and seemingly arbitrary binning).

      (5) The present results, both within and across trials, are difficult to reconcile with previous studies using MEG (Kok et al., 2017; Han et al., 2019), single-unit and multi-unit recordings (Kumar et al., 2017; Meyer & Olson 2011), as well as fMRI (Richter et al., 2018), which investigated similar questions but yielded different results; i.e., no reversal within or across trials, as well as dampening effects after more training. The authors do not provide a convincing explanation as to why their results should differ from previous studies, arguably further compounding the doubts about the present results raised by the methods and results concerns noted above.

      Impact:

      At present, I find the potential impact of the study by McDermott et al. difficult to assess, given the concerns mentioned above. Should the authors convincingly answer these concerns, the study could provide meaningful insights into the mechanisms underlying perceptual prediction. However, at present, I am not entirely convinced by the quality and reliability of the results and manuscript. Moreover, the difficulty in reconciling some of the present results with previous studies highlights the need for more convincing explanations of these discrepancies and a stronger discussion of the present results in the context of the literature.

    1. HP Borringar: Providing operational data, validating feasibility, and giving field-based feedback on all steps.

      Add a line break between the text above and the list so that the list doesn't collapse.

      I think we need to have more partners than HP for the project to be accepted (otherwise, other companies will be reluctant to support the project, especially if it gives a competitive advantage to HP).

    2. Upload and integrate drilling data for on-site modeling and decision support.

      We want the tool to say how the machine should be set on site for achieving the sought-after design (?)

    3. to collect operational data from ongoing/past projects, including richer datasets where feasible. The goal is to get digitalized data.

      A bit unclear to me: how is the data to be collected (app or semi-manually)?

    4. Collaborate with drilling companies to assess current data collection practices and the feasibility of gathering additional operational data (e.g., depth vs. time, torque, penetration rates).

      Interviews?

    5. These deviations introduce risks, including:
       - Uncertainty on performance due to unexpected spacing or interactions between boreholes.
       - Legal disputes from crossing property boundaries.
       - Increased costs and delays due to intersecting boreholes or unanticipated geological conditions.

      Nice points here, we could try to get the feedback from drillers as well

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, Zhang et al. presented an electrophysiology method to identify the layers of macaque visual cortex with a high-density Neuropixels 1.0 electrode. They found several electrophysiological signal profiles for high-resolution laminar discrimination and described a set of signal metrics for fine cortical layer identification.

      Strengths:

      There are two major strengths. One is the use of high-density electrodes: the Neuropixels 1.0 probe has electrodes at 20 um spacing, which can provide high resolution for cortical laminar identification. The second strength is the analysis. They found multiple electrophysiological signal profiles which can be used for laminar discrimination. Using this new method, they could identify the thinnest layer in macaque V1. The data support their conclusion.

      Weaknesses:

      While this electrophysiological strategy is much easier to perform, even in awake animals, compared to histological staining methods, it provides an indirect estimation of cortical layers. A parallel histological study could provide a direct match between electrode signal features and cortical laminar locations. However, there are technical challenges; for example, distortions in both electrode penetration and tissue preparation may prevent precise matching between electrode locations and cortical layers. In this case, additional microwire electrodes bound to the Neuropixels probe could be used to inject current and mark the locations of different depths in cortical tissue after recording.

      While we agree that it would be helpful to adopt a more direct method for linking laminar changes observed with electrophysiology to anatomical layers observed in postmortem histology, we do not believe that the approach suggested by the reviewer would be particularly helpful. The approach suggested involves making lesions, which are known to be quite variable in size, asymmetric in shape, and do not have a predictable geometry relative to the location of the electrode tip. In contrast, our electrophysiology measures have identified clear boundaries which precisely match the known widths and relative positions of all the layers of V1, including layer 4A, which is only 50 microns thick, much smaller than the resolution of lesion methods.

      Reviewer #2 (Public Review):

      Summary:

      This paper documents an attempt to accurately determine the locations and boundaries of the anatomically and functionally defined layers in macaque primary visual cortex using voltage signals recorded from a high-density electrode array that spans the full depth of cortex with contacts at 20 um spacing. First, the authors attempt to use current source density (CSD) analysis to determine layer locations, but they report a striking failure because the results vary greatly from one electrode penetration to the next and because the spatial resolution of the underlying local field potential (LFP) signal is coarse compared to the electrical contact spacing. The authors thus turn to examining higher frequency signals related to action potentials and provide evidence that these signals reflect changes in neuronal size and packing density, response latency and visual selectivity.

      Strengths:

      There is a lot of nice data to look at in this paper that shows interesting quantities as a function of depth in V1. Bringing all of these together offers the reader a rich data set: CSD, action potential shape, response power and coherence spectrum, and post-stimulus time response traces. Furthermore, data are displayed as a function of eye (dominant or non-dominant) and for achromatic and cone-isolating stimuli.

      This paper takes a strong stand in pointing out weaknesses in the ability of CSD analysis to make consistent determinations about cortical layering in V1. Many researchers have found CSD to be problematic, and the observations here may be important to motivate other researchers to carry out rigorous comparisons and publish their results, even if they reflect negatively on the value of CSD analysis.

      The paper provides a thoughtful, practical and comprehensive recipe for assigning traditional cortical layers based on easily-computed metrics from electrophysiological recordings in V1, and this is likely to be useful for electrophysiologists who are now more frequently using high-density electrode arrays.

      Weaknesses:

      Much effort is spent pointing out features that are well known, for example, the latency difference associated with different retinogeniculate pathways, the activity level differences associated with input layers, and the action potential shape differences associated with white vs. gray matter. These have been used for decades as indicators of depth and location of recordings in visual cortex as electrodes were carefully advanced. High density electrodes allow this type of data to now be collected in parallel, but at discrete, regular sampling points. Rather than showing examples of what is already accepted, the emphasis should be placed on developing a rigorous analysis of how variable vs. reproducible are quantitative metrics of these features across penetrations, as a function of distance or functional domain, and from animal to animal. Ultimately, a more quantitative approach to the question of consistency is needed to assess the value of the methods proposed here.

      We thank the reviewer for suggesting the addition of quantitative metrics to allow more substantive comparisons between various measures within and between penetrations. We have added quantification and describe this in the context of more specific comments made by this reviewer. We have retained descriptions of metrics that are well established because they provide an important validation of our approaches and laminar assignments.

      Another important piece of information for assessing the ability to determine layers from spiking activity is to carry out post-mortem histological processing so that the layer determination made in this paper could be compared to anatomical layering.

      We are not aware of any approach that would provide such information at sufficient resolution. For example, it is well known that electrolytic lesions often do not match to the locations expected from electrophysiological changes observed with single electrodes. As noted above, our observation that the laminar changes in electrophysiology precisely match the known widths and relative positions of all the layers of V1, including layer 4A, provides confidence in our laminar assignments.

      On line 162, the text states that there is a clear lack of consistency across penetrations, but why should there be consistency: how far apart in the cortex were the penetrations? How long were the electrodes allowed to settle before recording, how much damage was done to tissue during insertion? Do you have data taken over time - how consistent is the pattern across several hours, and how long was the time between the collection of the penetrations shown here?

      Answers to most of these questions can be found within the manuscript text. We have added text describing the distance between electrode penetrations (at least 1 mm, typically far more) and added a figure which shows a map of the penetration locations. The Methods section describes electrode penetration methods to minimize damage and the settling times of penetrations. Data are provided regarding changes in recordings over time (see Methods, Drift Correction). The stimuli used to generate the data described are presented within a total of 30 minutes or less, minimizing any changes that might occur due to electrode drift. There is a minimum of 3 hours between different penetrations from the same animal.

      The impact of the paper is lessened because it emphasizes consistency but not in a consistent manner. Some demonstrations of consistency are shown for CSDs, but not quantified. Figure 4A is used to make a point about consistency in cell density, but across animals, whereas the previous text was pointing out inconsistency across penetrations. What if you took a 40 or 60 um column of tissue and computed cell density, then you would be comparing consistency across potentially similar scales. Overall, it is not clear how all of these different metrics compare quantitatively to each other in terms of consistency.

      As noted above, we have now added quantitative comparisons of consistency between different metrics. It is unclear why the reviewer felt that we use Figure 4A to describe consistency. That figure was a photograph from a previous publication simply showing the known differences in neuron density that are used to define layers in anatomical studies. This was intended to introduce the reader to known laminar differences. At any rate, we have been unable to contact the previous publishers of that work to obtain permission to use the figure. So we have removed that figure as it is unnecessary to illustrate the known differences in cell density that are used to define layers. We have kept the citation so that interested readers can refer to the publication.

      In many places, the text makes assertions that A is a consistent indicator of B, but then there appear to be clear counterexamples in the data shown in the figures. There is some sense that the reasoning is relying too much on examples, and not enough on statistical quantities.

      Without reference to specific examples we are not able to address this point.

      Overall

      Overall, this paper makes a solid argument in favor of using action potentials and stimulus driven responses, instead of CSD measurements, to assign cortical layers to electrode contacts in V1. It is nice to look at the data in this paper and to read the authors' highly educated interpretation and speculation about how useful such measurements will be in general to make layer assignments. It is easy to agree with much of what they say, and to hope that in the future there will be reliable, quantitative methods to make meaningful segmentations of neurons in terms of their differentiated roles in cortical computation. How much this will end up corresponding to the canonical layer numbering that has been used for many decades now remains unclear.

      Reviewer #3 (Public Review):

      Summary:

      Zhang et al. explored strategies for aligning electrophysiological recordings from high-density laminar electrode arrays (Neuropixels) with the pattern of lamination across cortical depth in macaque primary visual cortex (V1), with the goal of improving the spatial resolution of layer identification based on electrophysiological signals alone. The authors compare the current commonly used standard in the field - current source density (CSD) analysis - with a new set of measures largely derived from action potential (AP) frequency band signals. Individual AP band measures provide distinct cues about different landmarks or potential laminar boundaries, and together they are used to subdivide the spatial extent of array recordings into discrete layers, including the very thin layer 4A, a level of resolution unavailable when relying on CSD analysis alone for laminar identification. The authors compare the widths of the resulting subdivisions with previously reported anatomical measurements as evidence that layers have been accurately identified. This is a bit circular, given that they also use these anatomical measurements as guidelines limiting the boundary assignments; however, the strategy is overall sensible and the electrophysiological signatures used to identify layers are generally convincing. Furthermore, by varying the pattern of visual stimulation to target chromatically sensitive inputs known to be partially segregated by layer in V1, they show localized response patterns that lend confidence to their identification of particular sublayers.

      The authors compellingly demonstrate the insufficiency of CSD analysis for precisely identifying fine laminar structure, and in some cases its limited accuracy at identifying coarse structure. CSD analysis produced inconsistent results across array penetrations and across visual stimulus conditions and was not improved in spatial resolution by sampling at high density with Neuropixels probes. Instead, in order to generate a typical, informative pattern of current sources and sinks across layers, the LFP signals from the Neuropixels arrays required spatial smoothing or subsampling to approximately match the coarser (50-100 µm) spacing of other laminar arrays. Even with smoothing, the resulting CSDs in some cases predicted laminar boundaries that were inconsistent with boundaries estimated using other measures and/or unlikely given the typical sizes of individual layers in macaque V1. This point alone provides an important insight for others seeking to link their own laminar array recordings to cortical layers.
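
      The smoothing-then-differencing procedure described in this paragraph is straightforward to express in code. Below is a minimal illustrative sketch (Python; this is not the authors' actual pipeline): it Gaussian-smooths an LFP depth profile sampled at 20 µm contact spacing and estimates CSD as the negative second spatial difference. The `csd` helper name and the 50 µm smoothing width are our own assumptions, chosen to fall in the 50-100 µm range mentioned above.

```python
import numpy as np

def csd(lfp, spacing_um=20.0, smooth_sd_um=50.0):
    """Estimate current source density as the negative second spatial
    derivative of an LFP depth profile (standard one-dimensional CSD),
    after Gaussian smoothing across depth.

    lfp: array of shape (n_channels,) or (n_channels, n_time)."""
    lfp = np.asarray(lfp, dtype=float)
    if lfp.ndim == 1:
        lfp = lfp[:, None]
    # Gaussian smoothing kernel across the depth (channel) axis
    sd = smooth_sd_um / spacing_um          # kernel width in channels
    radius = int(np.ceil(3 * sd))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sd) ** 2)
    k /= k.sum()
    # Edge-pad so the smoothed profile keeps its length
    pad = np.pad(lfp, ((radius, radius), (0, 0)), mode="edge")
    sm = np.stack([np.convolve(pad[:, t], k, mode="valid")
                   for t in range(pad.shape[1])], axis=1)
    # Second spatial difference approximates d2V/dz2
    h = spacing_um * 1e-6                   # contact spacing in meters
    out = -(sm[2:] - 2 * sm[1:-1] + sm[:-2]) / (h ** 2)
    return out.squeeze()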

      They next offer a set of measures based on analysis of AP band signals. These measures include analyses of the density, average signal spread, and spike waveforms of single- and multi-units identified through spike sorting, as well as analyses of AP band power spectra and local coherence profiles across recording depth. The power spectrum measures in particular yield compact peaks at particular depths, albeit with some variation across penetrations, whereas the waveform measures most convincingly identified the layer 6-white matter transition. In general, some of the new measures yield inconsistent patterns across penetrations, and some of the authors' explanations of these analyses draw intriguing but rather speculative connections to properties of anatomy and/or responsivity. However, taken as a group, the set of AP band analyses appear sufficient to determine the layer 6-white matter transition with precision and to delineate intermediate transition points likely to correspond to actual layer boundaries.

      Strengths:

      The authors convincingly demonstrate the potential to resolve putative laminar boundaries using only electrophysiological recordings from Neuropixels arrays. This is particularly useful given that histological information is often unavailable for chronic recordings. They make a clear case that CSD analysis is insufficient to resolve the lamination pattern with the desired precision and offer a thoughtful set of alternative analyses, along with an order in which to consider multiple cues in order to facilitate others' adoption of the strategy. The widths of the resulting layers bear a sensible resemblance to the expected widths identified by prior anatomical measurements, and at least in some cases there are satisfying signatures of chromatic visual sensitivity and latency differences across layers that are predicted by the known connectivity of the corresponding layers. Thus, the proposed analytical toolkit appears to work well for macaque V1 and has strong potential to generalize to use in other cortical regions, though area-targeted selection of stimuli may be required.

      Weaknesses:

      The waveform measures, and in particular the unit density distribution, are likely to be sensitive to the criteria used for spike sorting, which differ widely among experimenters/groups, and this may limit the usefulness of this particular measure for others in the community. The analysis of detected unit density yields fluctuations across cortical depth which the authors attribute to variations in neural density across layers; however, these patterns seemed particularly variable across penetrations and did not consistently yield peaks at depths that should have high neuronal density, such as layer 2. Therefore, this measure has limited interpretability.

      While we agree that our electrophysiological measure of unit density does not strictly reflect anatomical neuronal density, we would like to remind the reader that we use this measure only to roughly estimate the correspondence between changes in density and likely layer assignments. We rely on other measures (e.g. AP power, AP power changes in response to visual stimuli) that have sharp borders and clearer transitions to assign laminar boundaries. Further, as noted in the reviewer’s list of strengths, the laminar assignments made with these measures are cross-validated by differences in response latencies and sensitivity to different types of stimuli observed at different electrode depths.

      More generally, although the sizes of identified layers comport with typical sizes identified anatomically, a more powerful confirmation would be a direct per-penetration comparison with histologically identified boundaries. Ultimately, the absence of this type of independent confirmation limits the strength of their claim that veridical laminar boundaries can be identified from electrophysiological signals alone.

      As we have noted in response to similar comments from other reviewers, we are not aware of a method that would make this possible with sufficient resolution.

      Recommendations for the authors:

      Reviewing Editor (Recommendations For The Authors):

      The reviewers have indicated that their assessment would potentially be stronger if their advice for quantitative, statistically validated comparisons was followed, for example, to demonstrate variability or consistency of certain measures that are currently only asserted. Also, if available, some histological confirmation would be beneficial. It was requested that the use and modification of the layering from Balaram & Kaas be addressed, as well as the inconsistencies in the scale bars on those figures. There are two figure permission issues that need to be resolved prior to publication: Balaram & Kaas 2014 in Fig 1A, Kelly & Hawken 2017 in Fig. 4A.

      Please see detailed responses to reviewer comments below. We have added new supplemental figures to quantitatively compare variability among metrics. As noted above, the suggested addition of data linking the electrophysiology directly to anatomical observations of laminar borders from the same electrode penetration is not feasible. The figure reused in Figure 1A is from an open-access (CC BY) publication (Balaram & Kaas 2014). After reexamining the figure in the original study, we found that the inferred scale bar would give an obviously inaccurate result, so we decided to remove the scale bar from Figure 1A. We have not received any reply from Springer Nature regarding permission for Figure 4A (Kelly & Hawken 2017), so we decided to remove that reused figure from our article.

      Reviewer #1 (Recommendations For The Authors):

      Figure 4A has a different scale to Figure 4B-4F. It would be better to add dashed lines to indicate the relationship between the cortical layers or overall range from Figure 4A to the corresponding layers in 4B to 4F.

      The figure reused in Figure 4A has been removed due to a permission issue. See also comments above.

      Reviewer #2 (Recommendations For The Authors):

      General comments

      This paper demonstrates that voltage signals in frequency bands higher than those used for LFP/CSD analysis can be used from high-density electrical contact recording to generate a map of cortical layering in macaque V1 at a higher spatial resolution than previously attained.

      My main concern is that much of this paper seems to show that properties of voltage signals recorded by electrodes change with depth in V1. This of course is well known and has been mapped by many who have advanced a single electrode micron-by-micron through the cortex, listening and recording as they go. Figure 4 shows that spike shapes can give a clear indication of GM to WM borders, and this is certainly true and well known. Figures 5 and 6 show that activity level on electrodes can indicate layers related to LGN input, and this is known. Figure 7 shows that latencies vary with layer, and this is certainly true as we know. A main point seems to be that CSD is highly inconsistent. This is important to know if CSD is simply never going to be a good measure for layering in V1, but it would require quantification and statistics to make a fair comparison.

      We are glad to see that the reviewer understands that changes in electrical signals across layers are well known and are expected to have particular traits that change across layers. We do not claim to have discovered anything that is unexpected or unknown. Instead, we introduce quantitative measures that are sensitive to these known differences (historically, often just heard with an audio monitor, e.g. “LGN axon hash”). While the primary aim of this paper is not to show that Neuropixels probes can record voltage signal properties that could not be recorded with single electrodes before, we would like to point out that multi-electrode arrays have a very different sampling bias and also allow comparisons of simultaneous recordings across contacts with known, fixed distances between them. For example, our measure of “unit spread” could not be estimated with a single electrode.

      We’ve added Figure S3 to show quantitative comparison of variation between CSD and AP metrics. These figures add support to our prior, more anecdotal descriptions showing that CSDs are inconsistent and lack the resolution needed to identify thin layers.

      Some things are not explained very clearly. Like achromatic regions, and eye dominance - these are not quantified, and we don't know if they are mutually consistent - are achromatic/chromatic the same when tested through separate eyes? How consistent are these basic definitions? How definitive are they?

      The quantitative definitions of the achromatic region/COFD and eye dominance columns can be found in our previous paper (Li et al., 2022), cited in this article. The main theme of this study is to develop a strategy for accurately identifying layers; more detailed functional analyses will be described in future publications.

      Specific comments

      The abstract refers to CSD analysis and CSD signals. Can you be more precise - do you aim to say that LFP signals in certain frequency bands are already known to lack spatial localization, or are you claiming to be showing that LFP signals lack spatial resolution? A major point of the results appears to be lack of consistency of CSD, but I do not see that in the Abstract. The first sentence in the abstract appears to be questionable based on the results shown here for V1.

      We have updated the Abstract to minimize confusion and misunderstanding.

      Scale bar on Fig 1A implies that layers 2-5 are nearly 3 mm thick. Can you explain this thickness? Other figures here suggest layers 1-6 is less than 2 mm thick. Note, in a paper by the same authors (Balaram et al) the scale bar (100 um, Figure 4) on similar macaque tissue suggests that the cortex is much thinner than this. Perhaps neither is correct, but you should attempt to determine an approximately accurate scale. The text defines granular as Layer 4, but the scale bar in A implies layer 4 is 1 mm thick, but this does not match the ~0.5 mm thickness consistent with Figure 1E, F. The text states that L4A is less then 100 um thick, but the markings and scale bar in Figure 1A suggests that it could be more than 100 um thick.

      We thank the reviewer for pointing out that there are clearly errors in the scale bars used in these previously published figures from another group. In the original figure 1 (Balaram & Kaas 2014), histological slices were all scaled to one of the samples (chimpanzee), which lacked a scale bar. After reexamining the scale bar we derived from figure 2 of the original study, we found the same problem. Since relative widths of layers are more important than absolute widths in our study, we decided to remove the scale bar that we had derived and added to Figure 1A.

      Line 157. Fix "The most commonly visual stimulus"

      Text has been changed

      Line 161. Fix "through dominate eye"

      Text has been changed

      Line 166. Please specify if the methods established and validated below are histological, or tell something about their nature here.

      The Abstract and Introduction already describe the nature of our methods.

      Line 184. Text is mixing 'dominant' and 'dominate', the former is better.

      Text has been changed accordingly

      Line 188. Can you clarify "beyond the time before a new stimulus transition". Are you generally referring to the fact that neuronal responses outlast the time between changes in the stimulus?

      That is correct. We are referring to the fact that neuronal responses outlast the time between changes in the stimulus. We have edited the text for clarity.

      Line 196. Fix "dominate eye" in two places.

      Text has been changed

      Line 196. The text seems to imply it is striking to find different response patterns for the two eyes, but given the OD columns, why should this be surprising?

      Since we did not find any systematic comparison of CSD depth profiles for dominant/non-dominant eyes, or black/white stimuli, in past studies, we simply describe what we saw in our data. The rationale for testing each eye is that LGN projections from the two eyes are known to remain separated in the direct input layers of V1, so comparing CSDs from the two eyes could potentially help identify input layers such as L4C. Here we provide evidence showing that CSD profiles from the two eyes deviate from naive expectations. For example, CSDs from a black stimulus show less variation between the two eyes, whereas CSDs from a white stimulus range from similar profiles to drastically different ones across eyes.

      Line 198. Text like, "The most consistent..." is stating overall conclusions drawn by the authors before pointing the reader specifically to the evidence or the quantification that supports the statement.

      We’ve adjusted the text to point to Figure S2, where depth profiles of all penetrations are visualized, and to the newly added Figure S3, where the coefficients of variation for several metric profiles are shown.
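
      For concreteness, a coefficient-of-variation profile of the kind referred to here can be computed as follows (an illustrative Python sketch with a hypothetical `cv_profile` helper, not the authors' code): given aligned depth profiles of a metric from several penetrations, the CV at each depth is the across-penetration standard deviation divided by the absolute mean.

```python
import numpy as np

def cv_profile(profiles):
    """Coefficient of variation across penetrations at each depth.

    profiles: array of shape (n_penetrations, n_depths) holding a
    depth metric (e.g. CSD amplitude or AP power), assumed already
    aligned to a common laminar reference. Depths where the mean is
    near zero will yield large/unstable CV values."""
    profiles = np.asarray(profiles, dtype=float)
    mean = profiles.mean(axis=0)
    sd = profiles.std(axis=0, ddof=1)  # sample SD across penetrations
    return sd / np.abs(mean)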

      Line 200. "white stimulus is more variable" - the text does not tell us where/how this is supported with quantitative analysis/statistics.

      We’ve adjusted the text to point to Figures S2 and S3.

      The metric in 4B is not explained, the text mentions the plot but the reader is unable to make any judgement without knowledge of the method, nor any estimate of error bars.

      The figure is first mentioned in the Unit Density section, and the text there already describes the definitions of neuron density and unit density. We have also modified the text to point to the Methods section for details.

      Line 236. The text states the peak corresponds to L4C, but does not explain how the layer lines were determined.

      As described earlier in the CSD section, all layer boundaries are determined following the guide, which lays out the strategy for drawing borders by combining all metrics.

      At Line 296 the spike metrics section ends without providing a clear quantification of how useful the metrics will be. It is clear that the GM to WM boundary can be identified, but that can be found with single electrodes as well, as neurophysiologists get to see/hear the change in waveform as the electrode is advanced in even finer spatial increments than the 20 um spacing of the contacts here.

      The aim of this study is to develop an approach for accurately delineating all layers simultaneously. The metrics we explored are estimates of well-known properties, so they provide support for the correctness we hope to achieve. Here we first demonstrate their usefulness and later show the average across penetrations (Figure 9C-F). We are less concerned with quantifying how different factors affect the precision and consistency of these metrics, or how useful any single metric is, than with whether, as described in the guide section, we can delineate all layers given all metrics.

      Line 302-306. Why this statement is made here is unclear, it interrupts the flow for a reason that perhaps will be explained later.

      This statement notes the insensitivity of this measure to temporal differences, introducing the value of incorporating a measure of how AP power changes over time in the next section of the manuscript.

      Line 311. What is the reason to speculate about no canceling because of temporal overlap? Are you assuming a very sparse multi unit firing rate such that collisions do not happen?

      Here we describe a simple theoretical model in which spike waveforms only add without cancelling; in that case, the power would be proportional to the number of spikes. In reality, spike waveforms sometimes cancel, causing the theoretical relationship to deteriorate to some degree.
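
      This additive model is easy to simulate. The sketch below (illustrative Python; `synth_power` is a hypothetical helper and the waveform is a toy shape, not the authors' model) superimposes identical waveforms at random times: when overlaps are rare, mean power grows roughly linearly with spike count, while cancellation from overlapping waveforms degrades the linearity as firing becomes dense.

```python
import numpy as np

def synth_power(n_spikes, n_samples=30000, seed=0):
    """Toy additive-spike model: superimpose identical spike waveforms
    at random times and return the mean power of the summed signal.
    With sparse firing (few overlaps), power is ~proportional to the
    number of spikes; cross terms from overlaps perturb this."""
    rng = np.random.default_rng(seed)
    w = np.hanning(30) * np.sin(np.linspace(0, 2 * np.pi, 30))  # toy waveform
    sig = np.zeros(n_samples)
    for t in rng.integers(0, n_samples - 30, size=n_spikes):
        sig[t:t + 30] += w
    return np.mean(sig ** 2)
```

      With these parameters, increasing the spike count tenfold increases the mean power by roughly a factor of ten, with small deviations from the rare overlapping pairs.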

      Lines 327-346. There is a considerable amount of speculation and arguing based on particular examples and there is a lack of quantification. Neuron density is mentioned, but not firing rate. would responses from fewer neurons with higher firing rate not be similar to more neurons with lower firing rates?

      According to the theoretical model we described, power is proportional to the number of spikes, which in turn depends on both neuron density and firing rate. So fewer neurons with higher firing rates would generate power similar to more neurons with lower firing rates. We have expanded the explanation of the model and added Figure S4 showing the depth profile of firing rate. The text has also been adjusted to point to Figures S2 and S3 for quantitative comparisons of variability.

      Line 348 states there is a precise link between properties and cortical layers, but the manuscript has not, up to this point, shown how that link was determined or quantified it.

      Through our generative model of power, and given the similarity between the depth profiles of firing rate and neuron density (Figure S4), the depth profile of power can be used to approximate the depth profile of neuron density, which is known to correlate closely with cortical layering.

      Line 350. What is meant by "stochastic variability"?

      The text essentially says that the distances from an electrode contact to nearby cell bodies are random, so closer cells have higher spike amplitudes and in turn produce higher power on a channel.

      The figures showing the two metrics, Pf and Cf, should be shown for the same data sets. The markings indicate that Fig 5 and Fig 6 show results from non-overlapping data sets. This does not build confidence about the results in the paper.

      Here we use typical profiles to demonstrate the characteristics of the power and coherence spectra because of the variation across penetrations. Later, in the guide section, we show all metrics for one penetration (two more cases in supplemental figures) and how to combine all metrics to derive layer delineations.

      Line 375 the statement is somewhat vague, "there are nevertheless sometimes cases where they can resolve uncertainties," can you please provide some quantitative support?

      We provided three examples in Figure 6, and more examples are shown in Figures 8, S5, and S6.

      Line 379. I believe the change you want to describe here is a change associated with a transition in the visual stimulus. It would be good to clarify this in the first several sentences here. Baseline can mean different things. I got the impression that your stimuli flip between states at a rate fast enough that signals do not really have time to return to a baseline.

      We rephrased the sentence to describe the metric more precisely. A pair of uniform colors flipping at 1.5-second intervals is usually long enough for spiking activity to decay to a steady level.

      This section (379 - 398) continues a qualitative show-and-tell feel. There appears to be a lot of variability across the examples in Figure 7. How could you try to quantify this variability versus the variability in LFP? And, in this section overall, the text and figure legend don't really describe what the baseline is.

      The text has been adjusted to briefly describe the baseline window and to point to the Methods section, where the definitions are described in detail. We have added Figure S3, together with Figure S2, to address the variability across penetrations, stimuli, and metrics.

      Line 405 - 415. The discussion here does not consider that layers may not have well defined boundaries, the text gives the impression that there is some ultimate ground truth to which the metrics are being compared, but that may not be accurate.

      Except for a few layers/sublayers, such as L2, L3A, L3B, most layer boundaries of neocortex are well defined (Figure 1A) and histological staining of neurons/density and correlated changes in chemical content show very sharp transitions. The best of these staining methods is cytochrome oxidase, which shows sharp borders at the top and bottom of layer 4A, top and bottom of layer 4C, and the layer 5/6 border. There is also a sharp transition in neuronal cell body size and density at the top and bottom of layer 4Cb. The definition and delineation of all possible layers are constantly being refined, especially by accumulated knowledge of genetic markers of different cell types and connection patterns. In our study, we develop metrics to estimate well known anatomical and functional properties of different layers. We have also discussed layer boundaries that have been ambiguous to date and explained the reason and criteria to resolve them.

      Line 423. The text references Figure 1A in stating that relative thickness and position is crucial, but FIgure 1A does not provide that information and does not explain how it might be determined, or how much of a consensus there is. Also, the text does not consider that the electrode may go through the cortex at oblique angles, and not the same angle in each layer, and the relative thickness may not be a dependable reference.

      There are numerous studies that describe criteria to delineate cortical layers, the referenced article (Balaram & Kaas 2014) is used here as an example. We are not aware of any publication that has systematically compared the relative thickness of layers across the V1 surface of a given animal or across animals. Nevertheless, it is clear from the literature that there is considerable similarity across animals. Accordingly, we cannot know what the source of variability in overall cortical thickness in our samples is, but we do see considerable consistency in the relative thickness of the layers we infer from our measures. We illustrate the differences that we see across penetrations and consider likely causes, such as the extent to which the coverslip pressing down on the cortex might differentially compress the cortex at different locations within the chamber.

      Deviation of the probe angle from perpendicular to the surface scales all layers by the same factor and therefore does not change their relative thickness, and the rigid linear probe is unlikely to bend within the cortex.

      Line 433. The term "Coherence" is used, clarify is this is you Cf from Figure 6. The text states, "marked decrease at the bottom of layer 6". Please clarify this, I do not see that in Figure 6.

      Text has been adjusted.

      In Figure 6, the locations of the lines between L1 and 2 do not seem to be consistent with respect to the subtle changes in light blue shading, across all three examples, yet the text on line 436 states that there is a clear transition.

      We feel that the language used accurately reflects what is shown in the figure. While the transition is not sharp, it is clear that there is a transition. This transition is not used to define this laminar border. We have edited the text to clarify that the L1/2 border is better defined based on the change in AP power which shows a sharp transition (Figure 7). 

      The text states that the boundary is also "always clear" from metrics... and cites Figure 5, but I do not see that this boundary is clear for all three examples in Figure 5.

      Text has been adjusted.

      Line 438. The text states that "it is not unusual for unit density to fall to zero below the L1/2 border (Figure 8E)", but surprisingly, the line in Figure 8 E does not even cover the indicated boundary between L1 and L2.

      At this point, the number of statements in the text that do not clearly and precisely correlate to the data in the figures is worrisome, and I think you could lose the confidence of readers at this point.

      We do not see any inconsistency between what is stated in our text and what is noted by the reviewer. The termination of the blue line corresponds to the location where no units are detected. This is the location where “unit density falls to zero”. In this example, no units were resolved through spike sorting until ~100 µm beneath the L1/L2 boundary, which is exactly zero unit density (Figure 8E). That there are electrical signals in this region is clear from the AP power change (Figure 8C), which also shows the location of the L1/L2 border.

      Line 448. Text states that the 6A/B border is defined by a sharp boundary in AP power, but Figure 8A "AP power spectrum" does not show a sharp change at the A/B line. There is a peak in this metric in the middle to upper middle of 6A, but nothing so sharp to define a boundary between distinct layers, at least for penetration A2.

      Text has been adjusted.

      In Figure 8, the layer labels are not clear, whereas they are reasonably clear in the other figures.

      This is a technical problem with vector graphics that were not properly converted during PDF generation. We will upload high-quality vector graphics for each figure when we finalize the version of record.

      The text emphasizes differences in L4B and L4C with respect to average power and coherence, but the transition seems a bit gradual from layer 3B to 4C in some examples in Figure 6. And in Figure 5, A3, there doesn't appear to be any particular transition along the line between 4B and 4C.

      Early in this guide section, we pointed out that some metrics are good for some boundaries and that variation exists between penetrations. We have expanded the text emphasizing the importance of timing differences in ΔP/P for differentiating sublayers in L4. Lastly, if some boundaries remain unresolvable given all the metrics, prior knowledge of the relative thickness of the layers should be used.

      Line 466 provides prescriptions in absolute linear distances, but this is unwise given that cortex may be crossed at oblique angles by electrodes, particularly for parts of V1 that are not on the surface of the brain. Other parts of the text have emphasized relative measurements.

      Text has been changed using relative measurements.

      Line 507. The text says 9C and 4A are a good match, but the match does not look that good (4A has substantial dips at 0.5 and 0.75, and substantial peaks), and there is no quantification of fit. The error bars on 9C do not help show the variability across penetrations, they appear to be SEM, which shows that error bars get smaller as you average more data. It would seem more important to understand what is the variance in the density from one penetration to the next compared to the variance in density across layers.

      We have replaced “good match” with “roughly corresponds”. We note that we do not use unit density as a metric for identification of laminar borders and instead show that the expected locations of layers with higher neuronal density correspond to the locations where there are similar changes in unit density. It should be noted that Figure 9C is an average across many penetrations, so it should not be expected to show transitions as sharp as those in individual penetrations. Because of the figure permission issue, we have removed Figure 4A and changed the text accordingly.

      Figure 9C-F show a lot of variability in the individual curves (dim gray lines) compared to the overall average. Does this show that these metrics are not reliable indicators at the level of single penetration, but show some trends across larger averages?

      At the beginning of the guide, we emphasized that all metrics should be combined for each individual penetration, because some metrics are only reliable for delineating certain layer boundaries and the quality of data for the various measures varies between penetrations. The penetration average serves the same purpose explained in the previous response: it indicates that our layer delineation was not far off.

      The discussion mentions improvements in layer identification made here. Did this work check the assignments for these penetrations against assignments made based on some form of ground truth? Previous methods would advance electrodes steadily, make lesions, and carry out histology. Is there any way to tell how this method would compare to that?

      Even electrolytic lesions do not necessarily reveal ground truth and can be quite misleading, and their resolution is limited by lesion size. Lesions are typically variable in size, asymmetric, and have variable shape and position relative to the location of the electrode tip, likely affected by the quality and location of electrical grounding and variations in current flow due to the locations of blood vessels. A review of the published literature with electrode lesions shows that electrophysiological transitions are likely a far more accurate indicator of recording locations than post-mortem histology from electrolytic lesions. It is extremely rare for the locations of lesions to be precisely aligned to expected laminar transitions. See for example Chatterjee et al. (Nature 2004). Also see several manuscripts from the Shapley lab. The rare exception of which we are aware is Blasdel and Fitzpatrick (1984), in which consistently small and round lesions were produced, and even these would be too large (~100 microns) to accurately identify layers if it were not for the fact that the electrode penetrations were very long and tangential to the cortical layers.

      Reviewer #3 (Recommendations For The Authors):

      - The authors say (lines 360-362) that "Assuming spikes of a neuron spread to at least two adjacent recording channels, then the coherence between the two channels would be directly proportional to number of spikes, independent of spike amplitude." Has this been demonstrated? Very large amplitude spikes should show up on more channels than small amplitude spikes. Do waveform amplitudes and unit densities from the spike waveform analyses show consistent relationships to the power and/or coherence distributions over depth across penetrations?

      This part of the manuscript provides a theoretical rationale for what might be expected to affect the measures that we have derived. That is why we begin by stating that we are making an assumption. The answers to the reviewer’s questions are not known and have not been demonstrated. By beginning with this theoretical preface, we can point to cases where the data match these expectations as well as other cases where the data differ from the theoretical expectations.

      Coherence, by definition, is a normalized metric that is insensitive to amplitude. Spike amplitude mainly depends on how close the signal source is to the electrode, while spike spread mainly depends on cell body size and shape for a given distance to the electrode. Therefore, a very large spike amplitude could stem from a small cell very close to the electrode but would result in a small spike spread, especially for axonal spikes (Figure 4B, red spike). Spike amplitudes on average are higher in L4C, which matches the expectation that higher cell density would place cell bodies, on average, closer to the electrode (Figure S4A). Nonetheless, the high density of small cell bodies in L4C results in a small spike spread (Figure 9D).

      - I suggest clarifying what is defined as the baseline window for the ΔP/P measure - is it the entire 10-150 ms response window used for the power spectrum analysis?

      Text adjustments have been made in the Methods, where the time windows are defined at the beginning of the CSD section. Only the temporal-change metrics (ΔCSD and ΔP/P) use the baseline window ([-40, 10] ms). The other two spectral metrics (Power and Coherence) use the response window ([10, 150] ms).
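      The window-based definition above can be made concrete with a small sketch. This is our own illustration, not the authors' code: it assumes the recording for one channel arrives as a plain list of samples with a known start time and sampling rate, and takes ΔP/P as the relative change of mean squared amplitude between the two windows.

      ```python
      # Minimal sketch (illustrative only, not the authors' code): the relative
      # power change dP/P for one channel, using the windows given above:
      # baseline [-40, 10) ms and response [10, 150) ms around stimulus onset.

      def band_power(samples):
          """Mean squared amplitude over a window."""
          return sum(s * s for s in samples) / len(samples)

      def delta_p_over_p(signal, t0_ms, fs_khz, base=(-40, 10), resp=(10, 150)):
          """signal: a list of samples; t0_ms: time of the first sample relative
          to stimulus onset (ms); fs_khz: sampling rate in kHz (samples per ms)."""
          def window(win):
              lo = int((win[0] - t0_ms) * fs_khz)
              hi = int((win[1] - t0_ms) * fs_khz)
              return signal[lo:hi]
          p_base = band_power(window(base))
          p_resp = band_power(window(resp))
          return (p_resp - p_base) / p_base

      # A toy trace sampled at 1 kHz starting at -40 ms: amplitude 1 during the
      # baseline window, amplitude 2 during the response window.
      trace = [1.0] * 50 + [2.0] * 140
      print(delta_p_over_p(trace, -40, 1))  # (4 - 1) / 1 = 3.0
      ```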

      - Firing rate differs by cell type and, on average, differs by layer in V1. Many layer 2/3 neurons, for example, have low maximum firing rates when driven with optimized achromatic grating stimuli. To the extent that the generative models explaining the sources of power and coherence signals rely on the assumption that firing rates are matched across cortical depth, these models may be inaccurate. This assumption is declared only subtly, and late in the paper, but it is relevant to earlier claims.

      Text adjustments have been made to explicitly describe the possibility that an uneven depth profile of firing rate could counteract the depth profile of neuron density, resulting in a distorted or even flat depth profile of power/coherence that deviates far from the depth profile of neuron density. In a newly added Figure S4, we first show the average firing rate profile during a set of stimuli (uniform color, static/drifting, achromatic/chromatic gratings), then specifically the PSTHs of the same stimuli shown in this study. It can be seen that layers receiving direct LGN inputs tend to fire at a higher rate (L4C, L6A). Firing rates in the PSTHs either roughly match across layers or are much higher in the densely packed layers. Therefore, the depth profile of firing rate reinforces, rather than counteracts, that of neuron density, enhancing the utility of the power/coherence profile for identification of correct layer boundaries.

      - Given the acute preparation used for recordings, I wonder whether tissue is available for histological evaluation. Although the layers identified are generally appropriate in relative size, it would be particularly compelling if the authors could demonstrate that the fraction of the cortical thickness occupied by each layer corresponded to the proportion occupied by that layer along the probe trajectory in histological sections. This would lend strength to the claim that these analyses can be used to identify layers in the absence of histology. Furthermore, variations in apparent cortical thickness could arise from different degrees of deviation from surface normal approach angles, which might be apparent by evaluation of histological material. I would add that variation in thickness on the scale shown in Fig. S4 is more likely to have an explanation of this kind.

      To serve other purposes unrelated to this study (identification of CO blobs), we cut the postmortem tissue in horizontal slices, so the suggested histological comparison cannot be made. The cortical thickness measured in this study was affected not only by the angle deviation from the surface normal but also by swelling and compression of the cortex. Nevertheless, evaluating the absolute thickness of cortex is not the main purpose of this study.

      Text and figure suggestions:

      - Fig 1A has been modified from Balaram & Kaas (2014) to revert to the Brodmann nomenclature scheme they argue against using in that paper; I wonder if they would object to this modification without explanation. Related, in the main text the authors initially refer to layers using Brodmann's labels with a secondary scheme (Hassler's) in parentheses and later drop the parenthetical labels; these conventions are not described or explained. Readers less familiar with the multiple nomenclature schemes for monkey V1 layers might be confused by the multiple labels without context, and could benefit from a brief description of the convention the authors have adopted.

      Throughout our article, we used only Brodmann’s naming convention because it has historically been adopted for Old World monkeys, which we use in our study, whereas Hassler’s naming convention is more commonly used for New World monkeys. The choice of naming convention does not change our results, and it is out of scope for our study to discuss which nomenclature is more appropriate.

      - References to "dominate eye" throughout the text and figure legends should be replaced with "dominant eye."

      It has been changed throughout the article.

      - It is a bit odd to duplicate the same example in Fig. 2C and 2E. Perhaps a unique example would be a better use of the space.

      Here we first demonstrate the filtering effect, then compare profiles across different penetrations. Repeating the same example bridges the transition and allows side-by-side comparison.

      - The legend for Fig. 3 might be clearer if it simply listed the stimulus transitions for each column left to right, i.e. "black to white (non-dominant eye), white to black (non-dominant eye), black to white (dominant eye), ..."

      We feel that the icons are helpful. Here we want to show the stimulus colors directly to readers.

      - The misalignment between Fig. 4A vs. 4B-F, combined with the very small font size of the layer labels in Fig. 4B-F, make the visual comparison difficult. In Figs. 7 and 8, layer labels (and most labels in general) are much too small and/or low resolution to read easily. Overall, I would recommend increasing font size of labels in figures throughout the paper.

      The figure reused in Figure 4A has been removed due to a permission issue. Font sizes have been adjusted.

      - Line 591 "using of high-density probes" should be "using high-density probes"

      Text has been changed accordingly.

    1. Reviewer #1 (Public review):

      Summary:

      The authors quantified information in gesture and speech, and investigated the neural processing of speech and gestures in pMTG and LIFG, depending on their informational content, in 8 different time-windows, and using three different methods (EEG, HD-tDCS and TMS). They found that there is a time-sensitive and staged progression of neural engagement that is correlated with the informational content of the signal (speech/gesture).

      Strengths:

      A strength of the paper is that the authors attempted to combine three different methods to investigate speech-gesture processing.

      Weaknesses:

      (1) One major issue is that there is a tight anatomical coupling between pMTG and LIFG. Stimulating one area could therefore also result in stimulation of the other area (see Silvanto and Pascual-Leone, 2008). I therefore think it is very difficult to tease apart the contribution of these areas to the speech-gesture integration process, especially considering that the authors stimulate these regions in time windows that are very close to each other in both time and space (and the disruption might last longer over time).

      (2) Related to this point, it is unclear to me why the HD-tDCS/TMS is delivered in set time windows for each region. How did the authors determine this, and how do the results for TMS compare to their previous work from 2018 and 2023 (which describes a similar dataset+design)? How can they ensure they are only targeting their intended region since they are so anatomically close to each other?

      (3) As the EEG signal is often not normally distributed, I was wondering whether the authors checked the assumptions for their Pearson correlations. The authors could perhaps better choose to model the different variables to see whether MI/entropy could predict the neural responses. How did they correct the many correlational analyses that they have performed?

      (4) The authors use ROIs for their different analyses, but it is unclear why and on the basis of what these regions are defined. Why not consider all channels without making them part of an ROI, by using a method like the one described in my previous comment?

      (5) The authors describe that they have divided their EEG data into a "lower half" and a "higher half" (lines 234-236), based on entropy scores. It is unclear why this is necessary, and I would suggest just using the entropy scores as a continuous measure.

    1. Reviewer #1 (Public review):

      Spatiotemporal fine-tuning of cerebral blood flow balances the metabolic demands of changing neuronal activity with blood supply. Several 'feed-forward' mechanisms have been described that contribute to activity-dependent vasodilation as well as vasoconstriction leading to a reduction in perfusion. The messengers involved are ionic (K+), gaseous (NO), peptidergic (e.g., NPY, VIP), and other messengers (PGE2, GABA, glutamate, norepinephrine) that target endothelial cells, smooth muscle cells, or pericytes. Contributions of the respective signaling pathways likely vary across brain regions or even within specific brain regions (e.g., across the cortex) and are likely influenced by the brain's physiological state (resting, active, sleeping) or pathological departures from normal physiology.

      The manuscript "Elevated pyramidal cell firing orchestrates arteriolar vasoconstriction through COX-2-derived prostaglandin E2 signaling" by B. Le Gac, et al. investigates mechanisms leading to activity-dependent arteriole constriction. Here, mainly working in brain slices from mice expressing channelrhodopsin 2 (ChR2) in all excitatory neurons (Emx1-Cre; Ai32 mice), the authors show that strong optogenetic stimulation of cortical pyramidal neurons leads to constriction that is mediated through the cyclooxygenase-2 / prostaglandin E2 / EP1 and EP3 receptor pathway with contribution of NPY-releasing interneurons and astrocytes releasing 20-HETE. Specifically, using patch clamp, the authors show that 10-s optogenetic stimulation at 10 and 20 Hz leads to vasoconstriction (Figure 1), in line with a stimulation frequency-dependent increase in somatic calcium (Figure 2). The vascular effects were abolished in the presence of TTX and significantly reduced in the presence of glutamate receptor antagonists (Figure 3). The authors further show with RT-PCR on RNA isolated from patched cells that ~50% of analyzed cells express COX-1 or -2 and other enzymes required to produce PGE2 or PGF2a (Figure 4). Further, blockade of COX-1 and -2 (indomethacin), or of COX-2 alone (NS-398), abolishes constriction. In animals with chronic cranial windows that were anesthetized with ketamine and medetomidine, 10-s long optogenetic stimulation at 10 Hz leads to considerable constriction, which is reduced in the presence of indomethacin. Blockade of EP1 and EP3 receptors leads to a significant reduction of the constriction in slices (Figure 5). Finally, the authors show that blockade of 20-HETE synthesis caused a moderate reduction of constriction, and NPY Y1 receptor blockade a complete one.

      The mechanistic analysis of neurovascular coupling mechanisms as exemplified here will guide further in-vivo studies and has important implications for human neuroimaging in health and disease. Most of the data in this manuscript uses brain slices as an experimental model, which contrasts with neurovascular imaging studies performed in awake (head-fixed) animals. However, the slice preparation allows for patch clamp as well as easy drug application and removal. Further, the authors discuss their results in view of differences between brain slices and in-vivo experiments, including the absence of vascular tone as well as of the blood perfusion required for metabolite (e.g., PGE2) removal, and the presence of network effects in the intact brain. The manuscript and figures present the data clearly; regarding the presented mechanism, the data support the authors' conclusions. Some of the data were generated in vivo in head-fixed animals under anesthesia; in this regard, the authors should revise the introduction and discussion to include the important distinction between studies performed in slices, in acute or chronic in-vivo preparations under anesthesia (reduced network activity and reduced or blocked neuromodulation), or in awake animals (virtually undisturbed network and neuromodulatory activity).
      Further, while discussed to some extent, the authors could improve their manuscript by more clearly stating whether they expect the described mechanism to contribute to CBF regulation under 'resting state conditions' (i.e., in the absence of any stimulus), during short or sustained (e.g., visual, tactile) stimulation, or whether this mechanism is mainly relevant under pathological conditions, especially in the context of the optogenetic stimulation paradigm being used (10-s long stimulation of many pyramidal neurons at moderate-to-high frequencies) and the fact that constriction leading to undersupply in response to strongly increased neuronal activity seems counterintuitive.

    1. In one case rdi, in the other case rax.

      In Windows, rax is used, whereas in Linux it is rdi, and rax is used for executing system calls.

    1. Who Can Name the Bigger Number? by Scott Aaronson In an old joke, two noblemen vie to name the bigger number. The first, after ruminating for hours, triumphantly announces "Eighty-three!" The second, mightily impressed, replies "You win." A biggest number contest is clearly pointless when the contestants take turns. But what if the contestants write down their numbers simultaneously, neither aware of the other's? To introduce a talk on "Big Numbers," I invite two audience volunteers to try exactly this. I tell them the rules: You have fifteen seconds. Using standard math notation, English words, or both, name a single whole number—not an infinity—on a blank index card. Be precise enough for any reasonable modern mathematician to determine exactly what number you've named, by consulting only your card and, if necessary, the published literature. So contestants can't say "the number of sand grains in the Sahara," because sand drifts in and out of the Sahara regularly. Nor can they say "my opponent's number plus one," or "the biggest number anyone's ever thought of plus one"—again, these are ill-defined, given what our reasonable mathematician has available. Within the rules, the contestant who names the bigger number wins. Are you ready? Get set. Go. The contest's results are never quite what I'd hope. Once, a seventh-grade boy filled his card with a string of successive 9's. Like many other big-number tyros, he sought to maximize his number by stuffing a 9 into every place value. Had he chosen easy-to-write 1's rather than curvaceous 9's, his number could have been millions of times bigger. He still would have been decimated, though, by the girl he was up against, who wrote a string of 9's followed by the superscript 999. Aha! An exponential: a number multiplied by itself 999 times.
      Noticing this innovation, I declared the girl's victory without bothering to count the 9's on the cards. And yet the girl's number could have been much bigger still, had she stacked the mighty exponential more than once. Take 9^9^9, for example. This behemoth, equal to 9^387,420,489, has 369,693,100 digits. By comparison, the number of elementary particles in the observable universe has a meager 85 digits, give or take. Three 9's, when stacked exponentially, already lift us incomprehensibly beyond all the matter we can observe—by a factor of about 10^369,693,015. And we've said nothing of 9^9^9^9 or 9^9^9^9^9. Place value, exponentials, stacked exponentials: each can express boundlessly big numbers, and in this sense they're all equivalent. But the notational systems differ dramatically in the numbers they can express concisely. That's what the fifteen-second time limit illustrates. It takes the same amount of time to write 9999, 9^999, and 9^9^9^9—yet the first number is quotidian, the second astronomical, and the third hyper-mega astronomical. The key to the biggest number contest is not swift penmanship, but rather a potent paradigm for concisely capturing the gargantuan. Such paradigms are historical rarities. We find a flurry in antiquity, another flurry in the twentieth century, and nothing much in between. But when a new way to express big numbers concisely does emerge, it's often a byproduct of a major scientific revolution: systematized mathematics, formal logic, computer science. Revolutions this momentous, as any Kuhnian could tell you, only happen under the right social conditions. Thus is the story of big numbers a story of human progress. And herein lies a parallel with another mathematical story. In his remarkable and underappreciated book A History of π, Petr Beckmann argues that the ratio of circumference to diameter is "a quaint little mirror of the history of man."
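      The digit counts quoted above are easy to verify. A positive integer N has floor(log10(N)) + 1 decimal digits, so the size of 9^(9^9) can be computed from logarithms without ever constructing the number itself:

      ```python
      import math

      # Digits of 9^(9^9) = 9^387,420,489, via the digit-count formula
      # floor(log10(N)) + 1, using log10(9^k) = k * log10(9).
      exponent = 9 ** 9                          # 387,420,489
      digits = math.floor(exponent * math.log10(9)) + 1
      print(digits)                              # 369,693,100 digits
      ```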
      In the rare societies where science and reason found refuge—the early Athens of Anaxagoras and Hippias, the Alexandria of Eratosthenes and Euclid, the seventeenth-century England of Newton and Wallis—mathematicians made tremendous strides in calculating π. In Rome and medieval Europe, by contrast, knowledge of π stagnated. Crude approximations such as the Babylonians' 25/8 held sway. This same pattern holds, I think, for big numbers. Curiosity and openness lead to fascination with big numbers, and to the buoyant view that no quantity, whether of the number of stars in the galaxy or the number of possible bridge hands, is too immense for the mind to enumerate. Conversely, ignorance and irrationality lead to fatalism concerning big numbers. Historian Ilan Vardi cites the ancient Greek term sand-hundred, colloquially meaning zillion; as well as a passage from Pindar's Olympic Ode II asserting that "sand escapes counting." But sand doesn't escape counting, as Archimedes recognized in the third century B.C. Here's how he began The Sand-Reckoner, a sort of pop-science article addressed to the King of Syracuse: There are some ... who think that the number of the sand is infinite in multitude ... again there are some who, without regarding it as infinite, yet think that no number has been named which is great enough to exceed its multitude ... But I will try to show you [numbers that] exceed not only the number of the mass of sand equal in magnitude to the earth ... but also that of a mass equal in magnitude to the universe. This Archimedes proceeded to do, essentially by using the ancient Greek term myriad, meaning ten thousand, as a base for exponentials. Adopting a prescient cosmological model of Aristarchus, in which the "sphere of the fixed stars" is vastly greater than the sphere in which the Earth revolves around the sun, Archimedes obtained an upper bound of 10^63 on the number of sand grains needed to fill the universe.
      (Supposedly 10^63 is the biggest number with a lexicographically standard American name: vigintillion. But the staid vigintillion had better keep vigil lest it be encroached upon by the more whimsically-named googol, or 10^100, and googolplex, or 10^10^100.) Vast though it was, of course, 10^63 wasn't to be enshrined as the all-time biggest number. Six centuries later, Diophantus developed a simpler notation for exponentials, allowing him to surpass 10^(8×10^16). Then, in the Middle Ages, the rise of Arabic numerals and place value made it easy to stack exponentials higher still. But Archimedes' paradigm for expressing big numbers wasn't fundamentally surpassed until the twentieth century. And even today, exponentials dominate popular discussion of the immense. Consider, for example, the oft-repeated legend of the Grand Vizier in Persia who invented chess. The King, so the legend goes, was delighted with the new game, and invited the Vizier to name his own reward. The Vizier replied that, being a modest man, he desired only one grain of wheat on the first square of a chessboard, two grains on the second, four on the third, and so on, with twice as many grains on each square as on the last. The innumerate King agreed, not realizing that the total number of grains on all 64 squares would be 2^64−1, or about 18.4 quintillion—equivalent to the world's present wheat production for 150 years. Fittingly, this same exponential growth is what makes chess itself so difficult. There are only about 35 legal choices for each chess move, but the choices multiply exponentially to yield something like 10^50 possible board positions—too many for even a computer to search exhaustively. That's why it took until 1997 for a computer, Deep Blue, to defeat the human world chess champion. And in Go, which has a 19-by-19 board and over 10^150 possible positions, even an amateur human can still rout the world's top-ranked computer programs. Exponential growth plagues computers in other guises as well.
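      The Vizier's tally can be checked directly; Python's arbitrary-precision integers handle all 64 doublings exactly:

      ```python
      # One grain on the first square, doubling on each subsequent square:
      # a geometric series that sums to 2^64 - 1.
      grains = sum(2 ** square for square in range(64))
      print(grains)                  # 18,446,744,073,709,551,615
      print(grains == 2 ** 64 - 1)   # True
      ```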
      The traveling salesman problem asks for the shortest route connecting a set of cities, given the distances between each pair of cities. The rub is that the number of possible routes grows exponentially with the number of cities. When there are, say, a hundred cities, there are about 10^158 possible routes, and, although various shortcuts are possible, no known computer algorithm is fundamentally better than checking each route one by one. The traveling salesman problem belongs to a class called NP-complete, which includes hundreds of other problems of practical interest. (NP stands for the technical term 'Nondeterministic Polynomial-Time.') It's known that if there's an efficient algorithm for any NP-complete problem, then there are efficient algorithms for all of them. Here 'efficient' means using an amount of time proportional to at most the problem size raised to some fixed power—for example, the number of cities cubed. It's conjectured, however, that no efficient algorithm for NP-complete problems exists. Proving this conjecture, called P ≠ NP, has been a great unsolved problem of computer science for thirty years. Although computers will probably never solve NP-complete problems efficiently, there's more hope for another grail of computer science: replicating human intelligence. The human brain has roughly a hundred billion neurons linked by a hundred trillion synapses. And though the function of an individual neuron is only partially understood, it's thought that each neuron fires electrical impulses according to relatively simple rules up to a thousand times each second. So what we have is a highly interconnected computer capable of maybe 10^14 operations per second; by comparison, the world's fastest parallel supercomputer, the 9200-Pentium Pro teraflops machine at Sandia National Labs, can perform 10^12 operations per second. Contrary to popular belief, gray mush is not only hard-wired for intelligence: it surpasses silicon even in raw computational power.
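      The route explosion is plain factorial arithmetic: listing every ordering of n cities gives n! candidate routes, and the digit count of n! grows relentlessly:

      ```python
      import math

      # Orderings of 100 cities: roughly 10^158 routes, matching the figure above.
      routes = math.factorial(100)
      print(len(str(routes)))    # 158 decimal digits

      # Brute force is already painful at small sizes:
      print(math.factorial(10))  # 3,628,800 orderings for just ten cities
      ```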
But this is unlikely to remain true for long. The reason is Moore’s Law, which, in its 1990’s formulation, states that the amount of information storable on a silicon chip grows exponentially, doubling roughly once every two years. Moore’s Law will eventually play out, as microchip components reach the atomic scale and conventional lithography falters. But radical new technologies, such as optical computers, DNA computers, or even quantum computers, could conceivably usurp silicon’s place. Exponential growth in computing power can’t continue forever, but it may continue long enough for computers—at least in processing power—to surpass human brains. To prognosticators of artificial intelligence, Moore’s Law is a glorious herald of exponential growth. But exponentials have a drearier side as well. The human population recently passed six billion and is doubling about once every forty years. At this exponential rate, if an average person weighs seventy kilograms, then by the year 3750 the entire Earth will be composed of human flesh. But before you invest in deodorant, realize that the population will stop increasing long before this—either because of famine, epidemic disease, global warming, mass species extinctions, unbreathable air, or, entering the speculative realm, birth control. It’s not hard to fathom why physicist Albert Bartlett asserted "the greatest shortcoming of the human race" to be "our inability to understand the exponential function." Or why Carl Sagan advised us to "never underestimate an exponential." In his book Billions & Billions, Sagan gave some other depressing consequences of exponential growth. At an inflation rate of five percent a year, a dollar is worth only thirty-seven cents after twenty years. If a uranium nucleus emits two neutrons, both of which collide with other uranium nuclei, causing them to emit two neutrons, and so forth—well, did I mention nuclear holocaust as a possible end to population growth? 
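      Sagan's examples reduce to a line or two of arithmetic. The population figures below use the essay's round numbers (six billion people, doubling every forty years, seventy kilograms each) and assume the year 2000 as a baseline:

      ```python
      # A dollar under 5% annual inflation, twenty years on.
      value = 1 / 1.05 ** 20
      print(int(value * 100))                # 37 cents

      # Population doubling every 40 years from ~6 billion (year-2000 baseline),
      # weighed against the mass of the Earth (~5.97e24 kg).
      people_in_3750 = 6e9 * 2 ** ((3750 - 2000) / 40)
      print(people_in_3750 * 70 > 5.97e24)   # True: flesh would outweigh the planet
      ```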
Exponentials are familiar, relevant, intimately connected to the physical world and to human hopes and fears. Using the notational systems I’ll discuss next, we can concisely name numbers that make exponentials picayune by comparison, that subjectively speaking exceed 9^9^9^9 as much as the latter exceeds 9. But these new systems may seem more abstruse than exponentials. In his essay "On Number Numbness," Douglas Hofstadter leads his readers to the precipice of these systems, but then avers:

If we were to continue our discussion just one zillisecond longer, we would find ourselves smack-dab in the middle of the theory of recursive functions and algorithmic complexity, and that would be too abstract. So let’s drop the topic right here.

But to drop the topic is to forfeit, not only the biggest number contest, but any hope of understanding how stronger paradigms lead to vaster numbers. And so we arrive in the early twentieth century, when a school of mathematicians called the formalists sought to place all of mathematics on a rigorous axiomatic basis. A key question for the formalists was what the word ‘computable’ means. That is, how do we tell whether a sequence of numbers can be listed by a definite, mechanical procedure? Some mathematicians thought that ‘computable’ coincided with a technical notion called ‘primitive recursive.’ But in 1928 Wilhelm Ackermann disproved them by constructing a sequence of numbers that’s clearly computable, yet grows too quickly to be primitive recursive. Ackermann’s idea was to create an endless procession of arithmetic operations, each more powerful than the last. First comes addition. Second comes multiplication, which we can think of as repeated addition: for example, 5×3 means 5 added to itself 3 times, or 5+5+5 = 15. Third comes exponentiation, which we can think of as repeated multiplication. Fourth comes ... what? Well, we have to invent a weird new operation, for repeated exponentiation.
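Ackermann's ladder of operations can be written down directly. Here's a minimal Python sketch (the name `hyper` and the shortcut cases for the first three levels are my own); level 1 is addition, level 2 multiplication, level 3 exponentiation, and level 4 the 'weird new operation' of repeated exponentiation:

```python
def hyper(k, a, b):
    """Level-k operation in Ackermann's ladder, each level defined as
    b-fold iteration of the level below."""
    if k == 1:
        return a + b
    if k == 2:
        return a * b          # repeated addition
    if k == 3:
        return a ** b         # repeated multiplication (shortcut for speed)
    result = a                # k >= 4: iterate the level-(k-1) operation
    for _ in range(b - 1):
        result = hyper(k - 1, a, result)
    return result

print(hyper(2, 5, 3))                        # 15, i.e. 5+5+5
print([hyper(n, n, n) for n in (1, 2, 3)])   # Ackermann sequence starts 2, 4, 27
print(len(str(hyper(4, 5, 3))))              # 2185 digits: '5 tetrated to the 3'
```

Level 5 is already out of reach: hyper(5, 5, 5) would never finish, since even its intermediate values have more digits than the universe has particles.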
The mathematician Rudy Rucker calls it ‘tetration.’ For example, ‘5 tetrated to the 3’ means 5 raised to its own power 3 times, or 5^(5^5), a number with 2,185 digits. We can go on. Fifth comes repeated tetration: shall we call it ‘pentation’? Sixth comes repeated pentation: ‘hexation’? The operations continue infinitely, with each one standing on its predecessor to peer even higher into the firmament of big numbers. If each operation were a candy flavor, then the Ackermann sequence would be the sampler pack, mixing one number of each flavor. First in the sequence is 1+1, or (don’t hold your breath) 2. Second is 2×2, or 4. Third is 3 raised to the 3rd power, or 27. Hey, these numbers aren’t so big! Fee. Fi. Fo. Fum. Fourth is 4 tetrated to the 4, or 4^(4^(4^4)), which has 10^154 digits. If you’re planning to write this number out, better start now. Fifth is 5 pentated to the 5, or a tower 5^5^…^5 with ‘5 pentated to the 4’ numerals in the stack. This number is too colossal to describe in any ordinary terms. And the numbers just get bigger from there. Wielding the Ackermann sequence, we can clobber unschooled opponents in the biggest-number contest. But we need to be careful, since there are several definitions of the Ackermann sequence, not all identical. Under the fifteen-second time limit, here’s what I might write to avoid ambiguity:

A(111)—Ackermann seq—A(1)=1+1, A(2)=2×2, A(3)=3^3, etc

Recondite as it seems, the Ackermann sequence does have some applications. A problem in an area called Ramsey theory asks for the minimum dimension of a hypercube satisfying a certain property. The true dimension is thought to be 6, but the lowest dimension anyone’s been able to prove is so huge that it can only be expressed using the same ‘weird arithmetic’ that underlies the Ackermann sequence. Indeed, the Guinness Book of World Records once listed this dimension as the biggest number ever used in a mathematical proof.
(Another contender for the title once was Skewes’ number, about 10^10^10^34, which arises in the study of how prime numbers are distributed. The famous mathematician G. H. Hardy quipped that Skewes’ was "the largest number which has ever served any definite purpose in mathematics.") What’s more, Ackermann’s briskly-rising cavalcade performs an occasional cameo in computer science. For example, in the analysis of a data structure called ‘Union-Find,’ a term gets multiplied by the inverse of the Ackermann sequence—meaning, for each whole number X, the first number N such that the Nth Ackermann number is bigger than X. The inverse grows as slowly as Ackermann’s original sequence grows quickly; for all practical purposes, the inverse is at most 4.

Ackermann numbers are pretty big, but they’re not yet big enough. The quest for still bigger numbers takes us back to the formalists. After Ackermann demonstrated that ‘primitive recursive’ isn’t what we mean by ‘computable,’ the question still stood: what do we mean by ‘computable’? In 1936, Alonzo Church and Alan Turing independently answered this question. While Church answered using a logical formalism called the lambda calculus, Turing answered using an idealized computing machine—the Turing machine—that, in essence, is equivalent to every Compaq, Dell, Macintosh, and Cray in the modern world. Turing’s paper describing his machine, "On Computable Numbers," is rightly celebrated as the founding document of computer science. "Computing," said Turing,

is normally done by writing certain symbols on paper. We may suppose this paper to be divided into squares like a child’s arithmetic book. In elementary arithmetic the 2-dimensional character of the paper is sometimes used. But such use is always avoidable, and I think it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, on a tape divided into squares.
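Turing's model, a tape, a head, and a table of rules, fits in a few lines of Python. This is an illustrative sketch (the rule encoding is my own convention, not Turing's notation); it runs a tiny two-state machine that shuttles back and forth, halting after six steps with four 1s written:

```python
def run_turing(rules, max_steps=10_000):
    """Run a Turing machine on an initially blank (all-0) tape.
    rules maps (state, symbol_read) to (symbol_to_write, move, next_state);
    entering state 'H' halts. Returns (steps_taken, count of 1s on tape)."""
    tape, pos, state, steps = {}, 0, 'A', 0
    while state != 'H' and steps < max_steps:
        write, move, state = rules[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += 1 if move == 'R' else -1
        steps += 1
    return steps, sum(tape.values())

# A tiny two-state machine: write a 1, shuttle left and right, halt.
machine = {('A', 0): (1, 'R', 'B'), ('A', 1): (1, 'L', 'B'),
           ('B', 0): (1, 'L', 'A'), ('B', 1): (1, 'R', 'H')}
print(run_turing(machine))  # (6, 4)
```

Change the rule table and you change what the head does; that one dictionary is the machine's entire program.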
Turing continued to explicate his machine using ingenious reasoning from first principles. The tape, said Turing, extends infinitely in both directions, since a theoretical machine ought not be constrained by physical limits on resources. Furthermore, there’s a symbol written on each square of the tape, like the ‘1’s and ‘0’s in a modern computer’s memory. But how are the symbols manipulated? Well, there’s a ‘tape head’ moving back and forth along the tape, examining one square at a time, writing and erasing symbols according to definite rules. The rules are the tape head’s program: change them, and you change what the tape head does. Turing’s august insight was that we can program the tape head to carry out any computation. Turing machines can add, multiply, extract cube roots, sort, search, spell-check, parse, play Tic-Tac-Toe, list the Ackermann sequence. If we represented keyboard input, monitor output, and so forth as symbols on the tape, we could even run Windows on a Turing machine. But there’s a problem. Set a tape head loose on a sequence of symbols, and it might stop eventually, or it might run forever—like the fabled programmer who gets stuck in the shower because the instructions on the shampoo bottle read "lather, rinse, repeat." If the machine’s going to run forever, it’d be nice to know this in advance, so that we don’t spend an eternity waiting for it to finish. But how can we determine, in a finite amount of time, whether something will go on endlessly? If you bet a friend that your watch will never stop ticking, when could you declare victory? But maybe there’s some ingenious program that can examine other programs and tell us, infallibly, whether they’ll ever stop running. We just haven’t thought of it yet. Nope. Turing proved that this problem, called the Halting Problem, is unsolvable by Turing machines. The proof is a beautiful example of self-reference. 
It formalizes an old argument about why you can never have perfect introspection: because if you could, then you could determine what you were going to do ten seconds from now, and then do something else. Turing imagined that there was a special machine that could solve the Halting Problem. Then he showed how we could have this machine analyze itself, in such a way that it has to halt if it runs forever, and run forever if it halts. Like a hound that finally catches its tail and devours itself, the mythical machine vanishes in a fury of contradiction. (That’s the sort of thing you don’t say in a research paper.)

"Very nice," you say (or perhaps you say, "not nice at all"). "But what does all this have to do with big numbers?" Aha! The connection wasn’t published until May of 1962. Then, in the Bell System Technical Journal, nestled between pragmatically-minded papers on "Multiport Structures" and "Waveguide Pressure Seals," appeared the modestly titled "On Non-Computable Functions" by Tibor Rado. In this paper, Rado introduced the biggest numbers anyone had ever imagined. His idea was simple. Just as we can classify words by how many letters they contain, we can classify Turing machines by how many rules they have in the tape head. Some machines have only one rule, others have two rules, still others have three rules, and so on. But for each fixed whole number N, just as there are only finitely many distinct words with N letters, so too are there only finitely many distinct machines with N rules. Among these machines, some halt and others run forever when started on a blank tape. Of the ones that halt, asked Rado, what’s the maximum number of steps that any machine takes before it halts? (Actually, Rado asked mainly about the maximum number of symbols any machine can write on the tape before halting. But the maximum number of steps, which Rado called S(n), has the same basic properties and is easier to reason about.)
Rado called this maximum the Nth "Busy Beaver" number. (Ah yes, the early 1960’s were a more innocent age.) He visualized each Turing machine as a beaver bustling busily along the tape, writing and erasing symbols. The challenge, then, is to find the busiest beaver with exactly N rules, albeit not an infinitely busy one. We can interpret this challenge as one of finding the "most complicated" computer program N bits long: the one that does the most amount of stuff, but not an infinite amount. Now, suppose we knew the Nth Busy Beaver number, which we’ll call BB(N). Then we could decide whether any Turing machine with N rules halts on a blank tape. We’d just have to run the machine: if it halts, fine; but if it doesn’t halt within BB(N) steps, then we know it never will halt, since BB(N) is the maximum number of steps it could make before halting. Similarly, if you knew that all mortals died before age 200, then if Sally lived to be 200, you could conclude that Sally was immortal. So no Turing machine can list the Busy Beaver numbers—for if it could, it could solve the Halting Problem, which we already know is impossible. But here’s a curious fact. Suppose we could name a number greater than the Nth Busy Beaver number BB(N). Call this number D for dam, since like a beaver dam, it’s a roof for the Busy Beaver below. With D in hand, computing BB(N) itself becomes easy: we just need to simulate all the Turing machines with N rules. The ones that haven’t halted within D steps—the ones that bash through the dam’s roof—never will halt. So we can list exactly which machines halt, and among these, the maximum number of steps that any machine takes before it halts is BB(N). Conclusion? The sequence of Busy Beaver numbers, BB(1), BB(2), and so on, grows faster than any computable sequence. Faster than exponentials, stacked exponentials, the Ackermann sequence, you name it. 
Because if a Turing machine could compute a sequence that grows faster than Busy Beaver, then it could use that sequence to obtain the D‘s—the beaver dams. And with those D’s, it could list the Busy Beaver numbers, which (sound familiar?) we already know is impossible. The Busy Beaver sequence is non-computable, solely because it grows stupendously fast—too fast for any computer to keep up with it, even in principle. This means that no computer program could list all the Busy Beavers one by one. It doesn’t mean that specific Busy Beavers need remain eternally unknowable. And in fact, pinning them down has been a computer science pastime ever since Rado published his article. It’s easy to verify that BB(1), the first Busy Beaver number, is 1. That’s because if a one-rule Turing machine doesn’t halt after the very first step, it’ll just keep moving along the tape endlessly. There’s no room for any more complex behavior. With two rules we can do more, and a little grunt work will ascertain that BB(2) is 6. Six steps. What about the third Busy Beaver? In 1965 Rado, together with Shen Lin, proved that BB(3) is 21. The task was an arduous one, requiring human analysis of many machines to prove that they don’t halt—since, remember, there’s no algorithm for listing the Busy Beaver numbers. Next, in 1983, Allan Brady proved that BB(4) is 107. Unimpressed so far? Well, as with the Ackermann sequence, don’t be fooled by the first few numbers. In 1984, A.K. Dewdney devoted a Scientific American column to Busy Beavers, which inspired amateur mathematician George Uhing to build a special-purpose device for simulating Turing machines. The device, which cost Uhing less than $100, found a five-rule machine that runs for 2,133,492 steps before halting—establishing that BB(5) must be at least as high. Then, in 1989, Heiner Marxen and Jürgen Buntrock discovered that BB(5) is at least 47,176,870. 
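The 'beaver dam' argument can be made concrete: with a step budget D in hand, running a machine either catches it halting or proves nothing at all, and only a budget as big as the relevant Busy Beaver number turns "nothing" into a proof of non-halting. A small Python sketch (the convention that a step function returns None to signal halting is mine):

```python
def halts_within(step, state, budget):
    """Run a machine for at most `budget` steps. True means it halted;
    False means only 'did not halt yet'. If budget were at least BB(N)
    for an N-rule machine, False would actually prove non-halting, which
    is why no computable sequence can grow as fast as BB."""
    for _ in range(budget):
        state = step(state)
        if state is None:       # None signals a halt
            return True
    return False

countdown = lambda n: None if n == 0 else n - 1   # always halts
spinner = lambda n: n + 1                          # never halts

print(halts_within(countdown, 3, 100))  # True
print(halts_within(spinner, 0, 100))    # False, but by itself proves nothing
```

This is exactly the asymmetry the Busy Beaver hunters live with: a run that halts settles the question, a run that exceeds every budget they can afford settles nothing.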
To this day, BB(5) hasn’t been pinned down precisely, and it could turn out to be much higher still. As for BB(6), Marxen and Buntrock set another record in 1997 by proving that it’s at least 8,690,333,381,690,951. A formidable accomplishment, yet Marxen, Buntrock, and the other Busy Beaver hunters are merely wading along the shores of the unknowable. Humanity may never know the value of BB(6) for certain, let alone that of BB(7) or any higher number in the sequence. Indeed, already the top five and six-rule contenders elude us: we can’t explain how they ‘work’ in human terms. If creativity imbues their design, it’s not because humans put it there. One way to understand this is that even small Turing machines can encode profound mathematical problems. Take Goldbach’s conjecture, that every even number 4 or higher is a sum of two prime numbers: 10=7+3, 18=13+5. The conjecture has resisted proof since 1742. Yet we could design a Turing machine with, oh, let’s say 100 rules, that tests each even number to see whether it’s a sum of two primes, and halts when and if it finds a counterexample to the conjecture. Then knowing BB(100), we could in principle run this machine for BB(100) steps, decide whether it halts, and thereby resolve Goldbach’s conjecture. We need not venture far in the sequence to enter the lair of basilisks. But as Rado stressed, even if we can’t list the Busy Beaver numbers, they’re perfectly well-defined mathematically. If you ever challenge a friend to the biggest number contest, I suggest you write something like this:

BB(11111)—Busy Beaver shift #—1, 6, 21, etc

If your friend doesn’t know about Turing machines or anything similar, but only about, say, Ackermann numbers, then you’ll win the contest. You’ll still win even if you grant your friend a handicap, and allow him the entire lifetime of the universe to write his number. The key to the biggest number contest is a potent paradigm, and Turing’s theory of computation is potent indeed.
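The Goldbach machine of the last paragraph is easy to imitate in ordinary code. A hypothetical sketch (naive trial division stands in for whatever a 100-rule Turing machine would literally do):

```python
def is_prime(n):
    """Trial division: good enough for small even numbers."""
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

def goldbach_witness(n):
    """Return primes (p, q) with p + q == n, or None if the even number n
    is a counterexample to Goldbach's conjecture (none is known)."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None

print(goldbach_witness(10))   # (3, 7)
print(goldbach_witness(18))   # (5, 13)
# The machine version would loop over 4, 6, 8, ... and halt only on a
# counterexample, so a bound on how long it could possibly run would
# settle the conjecture.
```

Running this loop forever is easy; knowing in advance whether it ever halts is the part worth BB(100).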
But what if your friend knows about Turing machines as well? Is there a notational system for big numbers more powerful than even Busy Beavers? Suppose we could endow a Turing machine with a magical ability to solve the Halting Problem. What would we get? We’d get a ‘super Turing machine’: one with abilities beyond those of any ordinary machine. But now, how hard is it to decide whether a super machine halts? Hmm. It turns out that not even super machines can solve this ‘super Halting Problem’, for the same reason that ordinary machines can’t solve the ordinary Halting Problem. To solve the Halting Problem for super machines, we’d need an even more powerful machine: a ‘super duper machine.’ And to solve the Halting Problem for super duper machines, we’d need a ‘super duper pooper machine.’ And so on endlessly. This infinite hierarchy of ever more powerful machines was formalized by the logician Stephen Kleene in 1943 (although he didn’t use the term ‘super duper pooper’). Imagine a novel, which is imbedded in a longer novel, which itself is imbedded in an even longer novel, and so on ad infinitum. Within each novel, the characters can debate the literary merits of any of the sub-novels. But, by analogy with classes of machines that can’t analyze themselves, the characters can never critique the novel that they themselves are in. (This, I think, jibes with our ordinary experience of novels.) To fully understand some reality, we need to go outside of that reality. This is the essence of Kleene’s hierarchy: that to solve the Halting Problem for some class of machines, we need a yet more powerful class of machines. And there’s no escape. Suppose a Turing machine had a magical ability to solve the Halting Problem, and the super Halting Problem, and the super duper Halting Problem, and the super duper pooper Halting Problem, and so on endlessly. Surely this would be the Queen of Turing machines? Not quite.
As soon as we want to decide whether a ‘Queen of Turing machines’ halts, we need a still more powerful machine: an ‘Empress of Turing machines.’ And Kleene’s hierarchy continues. But how’s this relevant to big numbers? Well, each level of Kleene’s hierarchy generates a faster-growing Busy Beaver sequence than do all the previous levels. Indeed, each level’s sequence grows so rapidly that it can only be computed by a higher level. For example, define BB2(N) to be the maximum number of steps a super machine with N rules can make before halting. If this super Busy Beaver sequence were computable by super machines, then those machines could solve the super Halting Problem, which we know is impossible. So the super Busy Beaver numbers grow too rapidly to be computed, even if we could compute the ordinary Busy Beaver numbers. You might think that now, in the biggest-number contest, you could obliterate even an opponent who uses the Busy Beaver sequence by writing something like this: BB2(11111). But not quite. The problem is that I’ve never seen these "higher-level Busy Beavers" defined anywhere, probably because, to people who know computability theory, they’re a fairly obvious extension of the ordinary Busy Beaver numbers. So our reasonable modern mathematician wouldn’t know what number you were naming. If you want to use higher-level Busy Beavers in the biggest number contest, here’s what I suggest. First, publish a paper formalizing the concept in some obscure, low-prestige journal. Then, during the contest, cite the paper on your index card. To exceed higher-level Busy Beavers, we’d presumably need some new computational model surpassing even Turing machines. I can’t imagine what such a model would look like. Yet somehow I doubt that the story of notational systems for big numbers is over. Perhaps someday humans will be able concisely to name numbers that make Busy Beaver 100 seem as puerile and amusingly small as our nobleman’s eighty-three. 
Or if we’ll never name such numbers, perhaps other civilizations will. Is a biggest number contest afoot throughout the galaxy?

You might wonder why we can’t transcend the whole parade of paradigms, and name numbers by a system that encompasses and surpasses them all. Suppose you wrote the following in the biggest number contest:

The biggest whole number nameable with 1,000 characters of English text

Surely this number exists. Using 1,000 characters, we can name only finitely many numbers, and among these numbers there has to be a biggest. And yet we’ve made no reference to how the number’s named. The English text could invoke Ackermann numbers, or Busy Beavers, or higher-level Busy Beavers, or even some yet more sweeping concept that nobody’s thought of yet. So unless our opponent uses the same ploy, we’ve got him licked. What a brilliant idea! Why didn’t we think of this earlier? Unfortunately it doesn’t work. We might as well have written

One plus the biggest whole number nameable with 1,000 characters of English text

This number takes at least 1,001 characters to name. Yet we’ve just named it with only 80 characters! Like a snake that swallows itself whole, our colossal number dissolves in a tumult of contradiction. What gives? The paradox I’ve just described was first published by Bertrand Russell, who attributed it to a librarian named G. G. Berry. The Berry Paradox arises not from mathematics, but from the ambiguity inherent in the English language. There’s no surefire way to convert an English phrase into the number it names (or to decide whether it names a number at all), which is why I invoked a "reasonable modern mathematician" in the rules for the biggest number contest. To circumvent the Berry Paradox, we need to name numbers using a precise, mathematical notational system, such as Turing machines—which is exactly the idea behind the Busy Beaver sequence.
So in short, there’s no wily language trick by which to surpass Archimedes, Ackermann, Turing, and Rado, no royal road to big numbers. You might also wonder why we can’t use infinity in the contest. The answer is, for the same reason why we can’t use a rocket car in a bike race. Infinity is fascinating and elegant, but it’s not a whole number. Nor can we ‘subtract from infinity’ to yield a whole number. Infinity minus 17 is still infinity, whereas infinity minus infinity is undefined: it could be 0, 38, or even infinity again. Actually I should speak of infinities, plural. For in the late nineteenth century, Georg Cantor proved that there are different levels of infinity: for example, the infinity of points on a line is greater than the infinity of whole numbers. What’s more, just as there’s no biggest number, so too is there no biggest infinity. But the quest for big infinities is more abstruse than the quest for big numbers. And it involves, not a succession of paradigms, but essentially one: Cantor’s.

So here we are, at the frontier of big number knowledge. As Euclid’s disciple supposedly asked, "what is the use of all this?" We’ve seen that progress in notational systems for big numbers mirrors progress in broader realms: mathematics, logic, computer science. And yet, though a mirror reflects reality, it doesn’t necessarily influence it. Even within mathematics, big numbers are often considered trivialities, their study an idle amusement with no broader implications. I want to argue a contrary view: that understanding big numbers is a key to understanding the world. Imagine trying to explain the Turing machine to Archimedes. The genius of Syracuse listens patiently as you discuss the papyrus tape extending infinitely in both directions, the time steps, states, input and output sequences. At last he explodes. "Foolishness!" he declares (or the ancient Greek equivalent). "All you’ve given me is an elaborate definition, with no value outside of itself."
How do you respond? Archimedes has never heard of computers, those cantankerous devices that, twenty-three centuries from his time, will transact the world’s affairs. So you can’t claim practical application. Nor can you appeal to Hilbert and the formalist program, since Archimedes hasn’t heard of those either. But then it hits you: the Busy Beaver sequence. You define the sequence for Archimedes, convince him that BB(1000) is more than his 10^63 grains of sand filling the universe, more even than 10^63 raised to its own power 10^63 times. You defy him to name a bigger number without invoking Turing machines or some equivalent. And as he ponders this challenge, the power of the Turing machine concept dawns on him. Though his intuition may never apprehend the Busy Beaver numbers, his reason compels him to acknowledge their immensity. Big numbers have a way of imbuing abstract notions with reality. Indeed, one could define science as reason’s attempt to compensate for our inability to perceive big numbers. If we could run at 280,000,000 meters per second, there’d be no need for a special theory of relativity: it’d be obvious to everyone that the faster we go, the heavier and squatter we get, and the faster time elapses in the rest of the world. If we could live for 70,000,000 years, there’d be no theory of evolution, and certainly no creationism: we could watch speciation and adaptation with our eyes, instead of painstakingly reconstructing events from fossils and DNA. If we could bake bread at 20,000,000 degrees Kelvin, nuclear fusion would be not the esoteric domain of physicists but ordinary household knowledge. But we can’t do any of these things, and so we have science, to deduce about the gargantuan what we, with our infinitesimal faculties, will never sense. If people fear big numbers, is it any wonder that they fear science as well and turn for solace to the comforting smallness of mysticism? But do people fear big numbers? Certainly they do.
I’ve met people who don’t know the difference between a million and a billion, and don’t care. We play a lottery with ‘six ways to win!,’ overlooking the twenty million ways to lose. We yawn at six billion tons of carbon dioxide released into the atmosphere each year, and speak of ‘sustainable development’ in the jaws of exponential growth. Such cases, it seems to me, transcend arithmetical ignorance and represent a basic unwillingness to grapple with the immense. Whence the cowering before big numbers, then? Does it have a biological origin? In 1999, a group led by neuropsychologist Stanislas Dehaene reported evidence in Science that two separate brain systems contribute to mathematical thinking. The group trained Russian-English bilinguals to solve a set of problems, including two-digit addition, base-eight addition, cube roots, and logarithms. Some subjects were trained in Russian, others in English. When the subjects were then asked to solve problems approximately—to choose the closer of two estimates—they performed equally well in both languages. But when asked to solve problems exactly, they performed better in the language of their training. What’s more, brain-imaging evidence showed that the subjects’ parietal lobes, involved in spatial reasoning, were more active during approximation problems; while the left inferior frontal lobes, involved in verbal reasoning, were more active during exact calculation problems. Studies of patients with brain lesions paint the same picture: those with parietal lesions sometimes can’t decide whether 9 is closer to 10 or to 5, but remember the multiplication table; whereas those with left-hemispheric lesions sometimes can’t decide whether 2+2 is 3 or 4, but know that the answer is closer to 3 than to 9. Dehaene et al. conjecture that humans represent numbers in two ways. For approximate reckoning we use a ‘mental number line,’ which evolved long ago and which we likely share with other animals. 
But for exact computation we use numerical symbols, which evolved recently and which, being language-dependent, are unique to humans. This hypothesis neatly explains the experiment’s findings: the reason subjects performed better in the language of their training for exact computation but not for approximation problems is that the former call upon the verbally-oriented left inferior frontal lobes, and the latter upon the spatially-oriented parietal lobes. If Dehaene et al.’s hypothesis is correct, then which representation do we use for big numbers? Surely the symbolic one—for nobody’s mental number line could be long enough to contain 9^9^9^9, 5 pentated to the 5, or BB(1000). And here, I suspect, is the problem. When thinking about 3, 4, or 7, we’re guided by our spatial intuition, honed over millions of years of perceiving 3 gazelles, 4 mates, 7 members of a hostile clan. But when thinking about BB(1000), we have only language, that evolutionary neophyte, to rely upon. The usual neural pathways for representing numbers lead to dead ends. And this, perhaps, is why people are afraid of big numbers. Could early intervention mitigate our big number phobia? What if second-grade math teachers took an hour-long hiatus from stultifying busywork to ask their students, "How do you name really, really big numbers?" And then told them about exponentials and stacked exponentials, tetration and the Ackermann sequence, maybe even Busy Beavers: a cornucopia of numbers vaster than any they’d ever conceived, and ideas stretching the bounds of their imaginations. Who can name the bigger number? Whoever has the deeper paradigm. Are you ready? Get set. Go.

References

Petr Beckmann, A History of Pi, Golem Press, 1971.
Allan H. Brady, "The Determination of the Value of Rado’s Noncomputable Function Sigma(k) for Four-State Turing Machines," Mathematics of Computation, vol. 40, no. 162, April 1983, pp. 647-665.
Gregory J. Chaitin, "The Berry Paradox," Complexity, vol. 1, no. 1, 1995, pp.
26-30. At http://www.umcs.maine.edu/~chaitin/unm2.html.
A.K. Dewdney, The New Turing Omnibus: 66 Excursions in Computer Science, W.H. Freeman, 1993.
S. Dehaene, E. Spelke, P. Pinel, R. Stanescu, and S. Tsivkin, "Sources of Mathematical Thinking: Behavioral and Brain-Imaging Evidence," Science, vol. 284, no. 5416, May 7, 1999, pp. 970-974.
Douglas Hofstadter, Metamagical Themas: Questing for the Essence of Mind and Pattern, Basic Books, 1985. Chapter 6, "On Number Numbness," pp. 115-135.
Robert Kanigel, The Man Who Knew Infinity: A Life of the Genius Ramanujan, Washington Square Press, 1991.
Stephen C. Kleene, "Recursive predicates and quantifiers," Transactions of the American Mathematical Society, vol. 53, 1943, pp. 41-74.
Donald E. Knuth, Selected Papers on Computer Science, CSLI Publications, 1996. Chapter 2, "Mathematics and Computer Science: Coping with Finiteness," pp. 31-57.
Dexter C. Kozen, Automata and Computability, Springer-Verlag, 1997.
———, The Design and Analysis of Algorithms, Springer-Verlag, 1991.
Shen Lin and Tibor Rado, "Computer studies of Turing machine problems," Journal of the Association for Computing Machinery, vol. 12, no. 2, April 1965, pp. 196-212.
Heiner Marxen, Busy Beaver, at http://www.drb.insel.de/~heiner/BB/.
——— and Jürgen Buntrock, "Attacking the Busy Beaver 5," Bulletin of the European Association for Theoretical Computer Science, no. 40, February 1990, pp. 247-251.
Tibor Rado, "On Non-Computable Functions," Bell System Technical Journal, vol. XLI, no. 2, May 1962, pp. 877-884.
Rudy Rucker, Infinity and the Mind, Princeton University Press, 1995.
Carl Sagan, Billions & Billions, Random House, 1997.
Michael Somos, "Busy Beaver Turing Machine." At http://grail.cba.csuohio.edu/~somos/bb.html.
Alan Turing, "On computable numbers, with an application to the Entscheidungsproblem," Proceedings of the London Mathematical Society, Series 2, vol. 42, pp. 230-265, 1936.
Reprinted in Martin Davis (ed.), The Undecidable, Raven, 1965.
Ilan Vardi, "Archimedes, the Sand Reckoner," at http://www.ihes.fr/~ilan/sand_reckoner.ps.
Eric W. Weisstein, CRC Concise Encyclopedia of Mathematics, CRC Press, 1999. Entry on "Large Number" at http://www.treasure-troves.com/math/LargeNumber.html.

      What even is the largest number that has real-world use? And what would be the point of bigger numbers, if we can't use the big numbers we have now for real-world calculations?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      - Summary: 

      Recordings were made from the dentate nucleus of two monkeys during a decision-making task. Correlates of stimulus position and stimulus information were found to varying degrees in the neuronal activities. 

      We agree with this summary.

      - Strengths: 

      A difficult decision-making task was examined in two monkeys.

      We agree with this statement.

      - Weaknesses: 

      One of the monkeys did not fully learn the task. The manuscript lacked a coherent hypothesis to be tested, and no attempt was made to consider the possibility that this part of the brain may have little to do with the task that was being studied. 

      We understand the reviewer's concern. It is correct that one of the monkeys (Mi) did not perform at a high level, but it should be noted that both monkeys learned significantly above chance level. We would therefore argue that both monkeys did in fact learn the task, but that Mi's performance was suboptimal. This difference in performance levels gave us a rare opportunity to dive deeper into the reasons why some animals perform better than others, and we show that Mi (the lower-performing monkey) paid more attention to the outcome of the previous trial; this is evident from our behavioural and decoding models.

      We tested the overall hypothesis that neurons of the dentate nucleus can dynamically modulate their activity during a visual attention task comprising not only sensorimotor but also cognitive attentional components. Many neurons in the dentate are multimodal (Figure 3C-D), as had been theorized. One of the specific hypotheses that we tested is that dentate cells can be direction-selective for both the sensorimotor and the cognitive component. Given that many of the recorded cells showed direction-selectivity in their firing-rate modulation for gap directions and/or stimulus directions, we provide strong evidence that this hypothesis is correct. We have now spelled out this hypothesis more explicitly in the introduction of the revised version, and we now also explain better why we tested it. Indeed, earlier studies in primates, such as those by Herzfeld and colleagues (2018, Nat Neurosci) and van Es and colleagues (2019, Curr Biol), have indicated that direction-selectivity of cerebellar activity may occur in various sensorimotor domains.

      We also appreciate this Reviewer's comment that in our original submission we did not show our attempt to consider the possibility that this part of the brain may have little to do with the task being studied. We did in fact consider this possibility: we successfully injected 3 ml of muscimol (5 μg/ml, Sigma Aldrich) into the dentate nucleus in vivo in one of the monkeys (Mo). This application resulted in a reduction of more than 10% in correct responses on the covert attention task after 45 minutes, whereas performance remained the same following saline injections. Unfortunately, due to the timing of the experiments and COVID-19-related laboratory restrictions, we were unable to perform these experiments in the other monkey or repeat them in Mo. We aim to replicate this in future experiments and publish it when we have full datasets of at least two monkeys available. For this paper we have prioritized our tracing experiments, highlighting the connections of the dentate nucleus with attention-related areas in the brainstem and cortex in both monkeys, following perfusion.

      - Perhaps the large differences in performance between the two subjects can be used as a way to interpret the neural data's relationship to behavior, as it provided a source of variance. This is what we would hypothesize if we believed that this area of the brain is playing a significant role in the task. If one animal learns much more poorly, and this region of the brain is important for that behavior, then shouldn't there be clear, interpretable differences in the neural data? 

      We thank the Reviewer for this comment. We have added a new Supplementary Figure 2, in which we present the data for both monkeys separately in the revised manuscript. Comparing the two datasets however, we see more commonalities related to the significant learning in both monkeys than differences that might be related to their different levels of learning. We have therefore decided to show the different datasets transparently in the new Supplementary Figure 2, but to stay on the conservative side in our interpretations.

      - How should we look for these differences? A number of recent papers in mice have uncovered a large body of data showing that during the deliberation period, when the animal is interpreting a sensory stimulus (often using the whisker system), there is ramping activity in a principal component space among neurons that contribute to the decision. This ramping activity is present (in the PCA space) in the motor areas of the cortex, as well as in the medial and lateral cerebellar nuclei. Perhaps a similar computational approach would benefit the current manuscript. 

      We also appreciate this point. We have done the principal component analysis accordingly, and we indeed do find the ramping activity in several components of the dentate activity of both monkeys (Mi and Mo). We have now added a new Supplementary Figure 3 with the first three components of both correct and incorrect trials for Mi and Mo, highlighting their potential contribution.

      - What is the hypothesis that is being tested? That is, what do you think might be the function of this region of the cerebellum in this task? It seems to me that we are not entirely in the dark, as previous literature on mice decision-making tasks has produced a reasonable framework: the deliberation period coincides with ramping activity in many regions of the frontal lobe and the cerebellum. Indeed, the ramp in the cerebellum appears to be a necessary condition for the ramp to be present in the frontal lobe. Thus, we should see such ramping activity in this task in the dentate. When the monkey makes the wrong choice, the ramp should predict it. If you don't see the ramping activity, then it is possible that the hypothesis is wrong, or that you are not recording from the right place. 

      It is indeed one of our specific hypotheses that dentate cells can be direction-selective for the preparatory cognitive component and/or the sensorimotor response. We provide evidence that this hypothesis may be correct when we analyze the regular time-response curves (see Figure 2 and the new Supplementary Figure 2, where the data of both monkeys are now presented separately). Moreover, we have now verified this by analyzing the ramping curves in PCA space (new Supplementary Figure 3) and the firing frequency of DN neurons that modulated upon presentation of the C-stimulus (new Supplementary Figure 4). These figures and findings are now referred to in the main text.

      - As this is a difficult task that depends on the ability of the animals to understand the meaning of the cues, it is quite concerning that one of the monkeys performed poorly, particularly in the early sessions. Notably, the disparity between the two subjects is rather large: one monkey at the start of the recordings achieved a performance that was much better than the second monkey did at the end of the recording sessions. You highlighted the differences in performance in Figure 1D and mentioned that you started recording once the animals reached 60% performance. However, this did not make sense to me as the performance of Mi even after the final day of recording did not reach the performance of Mo on the first day of recording. Thus, in contrast to Mo, Mi appeared to be not ready for the task when the recording began.

      We understand this point. However, please note that the learning performance of the monkeys concerned retraining sessions after they had had several weeks of vacation. So, even though it is correct that one of the two monkeys showed very good consolidation and already started at a relatively high level in the first retraining session, the other one also started and ended above chance level (the y-axis starts at 0.5). We now highlight this point better in the Results section.

      - One objective of having two monkeys is to illustrate that what is true in one animal is also true in the other. In some figures, you show that the neural data are significantly different, while in others you combine them into one. Thus, are you confident that the neural data across the animals should be combined, as you have done in Figure 2? Perhaps you can use the large differences in performance as a source of variance to find meaning in the neural data. 

      This is a valid question; as highlighted above, we have now addressed this point in the new Supplementary Figure 2, where the data for both monkeys are presented separately. Given the sample sizes and level of variances, it is in general difficult to draw conclusions about the potential differences and contributions, but the data are sufficiently transparent to observe common trends. With regard to linking differences in the neural data to the differences in performance level, please also consider Figure 4, the new Supplementary Figure 3 (with the ramping PCA component) and new Supplementary Figure 4 (with the additional analysis of the ramping activity of DN neurons that modulated upon presentation of the C-stimulus), which suggests that the ramping stage of Mo starts before that of Mi. This difference highlights the possibility that injecting accelerations of the simple spike modulations of Purkinje cells in the cerebellar hemispheres into the complex of cerebellar nuclei may be instrumental in improving the performance of responses to covert attention, akin to what has been shown for the impact of Purkinje cells of the vestibulocerebellum on eye movement responses to vestibular stimulation (De Zeeuw et al. 1995, J Neurophysiol). This possibility is now also raised in the Discussion.

      - How do we know that these neurons, or even this region of the brain, contribute to this task? When a new task is introduced, the contributions of the region of the brain that is being studied are usually established via some form of manipulation. This question is particularly relevant here because the two subjects differed markedly in their performance, yet in Figure 3 you find that a similar percentage of neurons are responding to the various elements of the task.

      We appreciate this question. As highlighted above, we are refraining from showing our muscimol manipulation (3 ml of 5 μg/ml muscimol, Sigma Aldrich), as it only concerns 1 successful dataset and 1 control experiment. We hope to replicate this reversible lesion experiment in the future and publish it when we have full new datasets of at least two monkeys available. As explained above, for this paper we have sacrificed both monkeys following a timed perfusion, so as to have similar survival times for the transport of the neuro-anatomical tracer involved.  

      - Behavior in both animals was better when the gap direction was up/down vs. left/right. Is this difference in behavior encoded during the time that the animal is making a decision? Are the dentate neurons better at differentiating the direction of the cue when the gap direction is up/down vs. left/right? 

      These data have now been included in the new Supplementary Figure 2; we did not observe any significant differences in this respect.

      Reviewer #2:

      - The authors trained monkeys to discriminate peripheral visual cues and associate them with planning future saccades of an indicated direction. At the same time, the authors recorded single-unit neural activity in the cerebellar dentate nucleus. They demonstrated that substantial fractions of DN cells exhibited sustained modulation of spike rates spanning task epochs and carrying information about stimulus, response, and trial outcome. Finally, tracer injections demonstrated this region of the DN projects to a large number of targets including several known to interconnect the visual attention network. The data compellingly demonstrate the authors' central claims, and the analyses are well-suited to support the conclusions. Importantly, the study demonstrates that DN cells convey many motor and nonmotor variables related to task execution, event sequencing, visual attention, and arguably decision-making/working memory. 

      We thank the Reviewer for this positive and constructive feedback.

      - The study is solid and I do not have major concerns, but only points for possible improvement. 

      We thank the Reviewer for this positive feedback.

      - A key feature of this data is the extended changes/ramps in DN output across epochs (Figure 2). Crudely, this presents a challenge for the view that DN output mainly drives motor effectors, as the saccade itself lasts only a tiny fraction of the overall task. Some discussion of this dichotomy in thinking about the function(s) of the cerebellum, vis a vis the multifarious DN targets the authors demonstrate here, etc., would be helpful. 

      We agree with the Reviewer and have expanded our Discussion on this point, now also highlighting the outcome of the new PCA analysis recommended by Reviewer 1 (see the new Supplementary Figure 3).

      - A high-level suggestion on the data: the presentation of the data focuses (sensibly) on the representation of the stimulus and response epochs (Figures 2-3). Yet, the authors then show that from decoding, it is, in fact, a trial outcome that is best represented in the population (Figure 4). While there is nothing 'wrong' with this, it reads slightly incongruously, and the reader does a bit of a "double take" back to the previous figures to see if they missed examples of the trial-outcome signals, but the previous presentations only show correct trials. Consider adding somewhere in the first 3 main figures some neural data showing comparisons with incorrect trials. This way, the reader develops prior expectations for the outcome decoding result and frame of reference for interpreting it. On a related note, the text contains an earlier introduction of this issue (p24 last sentence) and p25 paragraph 1 cites Figure 3D and 3E for signals "related to the absence of reward" - but the caption says this includes only correct trials? 

      We thank the Reviewer for bringing up these points. We have addressed the textual suggestions. Moreover, we have done the PCA analysis suggested by Reviewer 1 for both the correct and incorrect trials (see Supplementary material).

      - P29: The discrepancy in retrograde labeling between monkeys (2 orders of magnitude): I realize the authors can't really do anything about this, but the difference is large enough to warrant concerns in the interpretation (how did the tracer spread over the drastically larger area? Isotropically? Could it cross more "hard boundaries" and incorporate qualitatively different inputs/outputs?). A small discussion of possible caveats in interpreting the outcomes would be helpful. 

      We fully agree with this comment. As highlighted in the text, in both monkeys we first identified the optimal points for injection in the dentate nucleus electrophysiologically and we used the same pump with the same settings to carry out the injections, but even so the differences are substantial. We suspect that the larger injection might have been caused by an air bubble trapped in the syringe or a deviation in the stock solution, but we can never be sure of that. We have added a potential explanation for the caveat that might have played a role.

      - And a list of quick points: 

      We have addressed all points listed below; we want to thank the Reviewer for bringing them up.

      P3 paragraph 2 needs comma "in daily life,". 

      P4 paragraph 2 "C-gap" terminology not previously defined. 

      P4 paragraph 2 "animals employed different behavioral strategies". Grammatically, you should probably say "each animal employed a different behavioral strategy," but also scientifically the paragraph doesn't connect this claim to anything about the DN (whereas, e.g., the abstract does make this connection clear). 

      P5 paragraph 1 "theca" should be "the". 

      P6 paragraph 1 problem with ignashenkova citation insert. 

      P10 paragraph 1 I think the spike rate "difference between highest and lowest" is not exactly the same as "variance," you might want to change the terminology. 

      P10 paragraph 1 should probably say "To determine if a cell preferentially modulated". 

      P10 paragraph 1 last sentence the last clause could be clearer. 

      P17 paragraph 2 should be something like "as well as those by Carpenter and..."? 

      P20 caption: consider "...directionality in the task: only one C-stim...". 

      P20 caption: consider "to the left and right in the [L/R] task...to the top/bottom in the [U/D] task". 

      Fig1E and S1 - is there a physical meaning of the "weight" unit, and if none, can this be transformed into a more meaningful unit? 

      P21 paragraph 1 consider "activity was recorded for 304 DN neurons...". 

      P21 paragraph 1 "correlations with the temporal windows" it's not clear how activity can "correlate" with a time window, consider rephrasing (activity levels changed during these time epochs, depending on stimulus identity). 

      P21 paragraph 1 should be "by comparing the number of spikes in a bin...". 

      P22 paragraph 2 "when we aligned the neurons to the time of maximum change" needs clarification. The maximum change of what? And per neuron? Across the population? 

      P22 paragraph 2 "than that of the facilitating" should be "than did the facilitating units". 

      P24 paragraph 1 needs a comma and rewording "Within each direction, trials are sorted by the time of saccade onset". 

      P24 paragraph 1 should probably say "Same as in G, but for suppressed cells". 

      P24 paragraph 2 should say "more than one task event" not "events". 

      P24 paragraph 2 needs a comma "To fully characterize the neural responses, we fitted". 

      P25 paragraph 1 should probably say "we sampled from similar populations of DN". 

      P34 paragraph 3 consider rephrasing the sentence that contains both "dissociation" and "dissociate". 

      P37 last line: consider "coordination of cerebellum and cerebral cortex *in* higher order mental..."? 

      P38 paragraph 1 citation needed for "kinematics of goal-directed hand actions of others"? 

      P38 paragraph 1 commas probably not needed "map visual input, from high-level visual regions, onto..." 

      References

      - Herzfeld D.J., Kojima Y, Soetedjo R, Shadmehr R (2018) Encoding of error and learning to correct that error by the Purkinje cells of the cerebellum. Nat Neurosci 21:736–743.

      - van Es DM, van der Zwaag W, Knapen T (2019) Topographic Maps of Visual Space in the Human Cerebellum. Curr Biol 29(10):1689-1694.e3.

      - De Zeeuw CI, Wylie DR, Stahl JS, Simpson JI. (1995) Phase relations of Purkinje cells in the rabbit flocculus during compensatory eye movements. J Neurophysiol. Nov;74(5):2051-64. doi: 10.1152/jn.1995.74.5.2051.

    1. In this video, the command used for the clipboard is  pbcopy  because the operating system is Mac. However, if you are using Windows, the command to use is  clip

      Even using clip, I could not copy the key. What should I do?
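      For reference, a minimal sketch of picking the right clipboard command per platform (command names assumed: pbcopy on macOS, clip on Windows shells such as Git Bash, xclip on Linux; the key path in the usage comment is only a hypothetical example):

      ```shell
      # Select the clipboard command for the current platform.
      clipboard_cmd() {
        case "$(uname -s)" in
          Darwin*)              echo "pbcopy" ;;                      # macOS
          MINGW*|MSYS*|CYGWIN*) echo "clip" ;;                        # Windows (Git Bash, MSYS, Cygwin)
          *)                    echo "xclip -selection clipboard" ;;  # Linux (requires xclip installed)
        esac
      }

      # Example usage (hypothetical key path):
      # cat ~/.ssh/id_ed25519.pub | $(clipboard_cmd)
      ```

      Note that in the plain Windows cmd prompt (where `cat` does not exist), the equivalent pipe would be `type` into `clip`; in PowerShell, `Get-Content ... | clip` works as well.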

    1. Reviewer #3 (Public review):

      Summary:

      Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Trials were sorted according to participants' "seen" or "unseen" reports in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participants' reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.

      Strengths and Weaknesses:

      One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.

      A potential area for improvement in this study is the use of single time-points from peak decoding accuracy to generate current source density topography maps. While we recognize that the decoding analysis employed here differs from traditional ERP approaches, the robustness of the findings could be enhanced by exploring current source density over relevant time windows. Event-related peaks, both in terms of timing and amplitude, can sometimes be influenced by noise or variability in trial-averaged EEG data, and a time-window analysis might provide a more comprehensive and stable representation of the underlying neural dynamics.

      It is helpful that the authors show the standard error of the mean for the classifier performance over time. A similar indication of a measure of variance in other figures could improve clarity and transparency. That said, the paper appears solid regarding technical issues overall. The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.

      Impact of the Work:

      This study effectively demonstrates a phenomenon that has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.

  Nov 2024
    1. Reviewer #2 (Public review):

      Summary:

      The manuscript by Liang and Guan provides an impressive attempt to characterize the conformational free energy landscape of melibiose permease (MelB), a symporter member of the major facilitator superfamily (MFS) of transporters. Although similar studies have been conducted previously for other members of the MFS, each member or subfamily has its own unique features that make the employment of such methods quite challenging. While the methodology is indeed impressive, characterizing the coupling between large-scale conformational changes and substrate binding in membrane transporters is quite challenging and requires a sophisticated methodology. The conclusions obtained from the three sets of path-optimization and free energy calculations done by the authors are generally supported by the provided data and certainly add to our understanding of how sodium binding facilitates the transport of melibiose in MelB. However, I am not convinced that the data were generated reliably, which calls the relevance of the conclusions into question as well. In particular, I have some concerns regarding the implementation of the methodology that I discuss below.

      (1) In enhanced sampling techniques, much attention is often given to the sampling algorithm. Although the sampling algorithm is quite important, and this manuscript has chosen an excellent pair for the task (the string method with swarms of trajectories, SMwST, and replica-exchange umbrella sampling, REUS), there are other important factors that must be taken into account: more specifically, the collective variables used and the preparation of the initial conformations for sampling. I have objections to both of these (particularly the latter) that I detail below. Overall, I am not confident that the free energy profiles generated (summarized in Figure 5) are reliable, and unfortunately, much of the data presented in this manuscript relies heavily on these free energy profiles.

      (2) The authors state that they have had an advantage over other similar studies in that they had two endpoints of the string from experimental data. I agree that this is an advantage. However, this could lead to some dangerous flaws in the methodology if not appropriately taken into account. Proteins such as membrane transporters have many slow degrees of freedom that cannot be fully captured within tens of nanoseconds (90 ns was the simulation time used here for the REUS). Biased sampling allows us to overcome this challenge to some extent, but it is virtually impossible to take into account all slow degrees of freedom in the enhanced sampling protocol (e.g., the collective variables used here do not represent anything related to sidechain dynamics). Therefore, if one mixes initial conformations that come from different initial structures (e.g., an OF state and an IF state from two different PDB files), it is very likely that, despite all equilibration and relaxation during SMwST and REUS simulations, the conformations that come from different sources never truly mix. This is dangerous in that it is quite difficult to detect such inconsistencies, and from a theoretical point of view it makes the free energy calculations impossible. Methods such as WHAM and its various offshoots all rely on overlap between neighboring windows to calculate the free energy difference between two windows, and the overlap should be in all dimensions, not just the ones used for biasing. This is related to well-known issues such as hidden barriers and metastability. If one uses two different structures to generate the initial conformations, then the authors need to show that their sampling has been long enough to allow the two sets of conformations to mix and overlap in all dimensions, which is a difficult task.
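      For context, the overlap requirement is visible directly in the WHAM equations (a standard textbook form, not specific to this manuscript). The unbiased distribution of a collective variable ξ is reconstructed from the biased histograms n_i(ξ) of K umbrella windows (N_j samples each, bias potential U_j), with the window free energies F_j determined self-consistently:

      ```latex
      P^{u}(\xi) = \frac{\sum_{i=1}^{K} n_i(\xi)}{\sum_{j=1}^{K} N_j \, e^{\beta\left[F_j - U_j(\xi)\right]}},
      \qquad
      e^{-\beta F_j} = \sum_{\xi} P^{u}(\xi)\, e^{-\beta U_j(\xi)} .
      ```

      The F_j are pinned to one another only through regions where neighboring histograms overlap; if the samples from the two source structures occupy disjoint regions along any unbiased slow degree of freedom, the self-consistent solution is ill-conditioned and the stitched profile can be arbitrarily wrong, as the reviewer's comment notes.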

      (3) I also have concerns regarding the choice of collective variables. The authors have split the residues in each transmembrane helix into the cyto- and periplasmic sides. They have then calculated the center-of-mass distance between the cytoplasmic sides of certain pairs of helices, and have done the same for the periplasmic side. Given the shape of a helix, this does not seem to be an ideal choice: it captures the translational motion of the helix more than its rotational motion, yet transmembrane helices are more likely to undergo rotational than translational motion.

      (4) Convergence: In my opinion, the string method convergence data (Figure S2) do not show strong evidence of convergence. REUS convergence is also not discussed; no information is provided on the exchange rate or the overlap between windows.

    2. Author response:

      Reviewer 1:

      (1) Free energy barriers appear to be very high for a substrate transport process. In Figure 3, the transitions from IF (Inward facing) to OF (Outward facing) state appear to have a barrier of 12 kcal/mol. Other systems with mutant or sodium unbound have even higher barriers. This does not seem consistent with previous studies where transport mechanisms of transporters have been explored using molecular dynamics. 

      First, in Figure 3, the transition from the IF to the OF state does not have a barrier of 12 kcal/mol. The IFF to OFB transition is almost barrierless, and the barrier from OFB to OFF is ~5 kcal/mol, which is also evident in Figure 2.

      If the reviewer was referring to the transition from OFB to IFB states, the barrier is 6.8 kcal/mol (Na+ bound state), and the rate-limiting barrier in the entire sugar transport process (Na+ bound state) is 8.4 kcal/mol, as indicated in Figure 2 and Table 1, which is much lower than the 12 kcal/mol barrier the reviewer mentioned. When the Na+ is unbound, the barrier can be as high as 12 kcal/mol, but it is this high barrier that leads to our conclusion that the Na+ binding is essential for sugar transport, and the 12 kcal/mol barrier indicates an energetically unfavorable sugar translocation process when the Na+ is unbound, which is unlikely to be the major translocation process in nature. 

      Even for the 12 kcal/mol barrier reported for the Na+ unbound state, it is still not too high considering the experimentally measured MelB sugar active transport rate, which is estimated to be on the order of 10 to 100 s-1. This range of transport rate is typical for similar MFS transporters such as the lactose permease (LacY), which has an active transport rate of 20 s-1. The free energy barrier associated with the active transport is thus on the order of ~15-16 kcal/mol based on transition state theory assuming kBT/h as the prefactor. This experimentally estimated barrier is higher than all of our calculated barriers. Our calculated barrier for the sugar translocation with Na+ bound is 8.4 kcal/mol, which means an additional ~7-8 kcal/mol barrier is contributed by the Na+ release process after sugar release in the IFF state. This is a reasonable estimation of the Na+ unbinding barrier.
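      As a check of the quoted numbers (assuming T ≈ 300 K, so RT ≈ 0.593 kcal/mol and k_BT/h ≈ 6.2×10^12 s^-1), the transition-state-theory estimate for LacY's 20 s^-1 rate works out as:

      ```latex
      k = \frac{k_B T}{h}\, e^{-\Delta G^{\ddagger}/RT}
      \;\Longrightarrow\;
      \Delta G^{\ddagger} = RT \ln\frac{k_B T / h}{k}
      \approx 0.593 \,\ln\frac{6.2\times 10^{12}}{20}
      \approx 15.7\ \text{kcal/mol},
      ```

      consistent with the ~15-16 kcal/mol stated above; the 10-100 s^-1 range for MelB maps in the same way to roughly 14.7-16.1 kcal/mol.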

      Therefore, whether the calculated barrier is too high depends on the experimental kinetics measurements, which are often challenging to perform. Based on the existing experimental data, the MFS transporters are usually relatively slow in their active transport cycle. The calculated barrier thus falls within the reasonable range considering the experimentally measured active transport rates.

      (2) Figure 2b: The PMF between images 20-30 shows the conformation change from OF to IF, where the occluded (OC) state is the highest barrier for the transition. However, the OC state is usually a stable conformation and should be in a local minimum. There should be free energy barriers between OF and OC and between OC and IF.  

      First, the occluded state (OCB) is not between images 20-30; it is between images 10 and 20. Second, there is no solid evidence that the OCB state is a stable conformation and a local minimum. Existing experimental structures of MFS transporters seldom have the fully occluded state resolved.

      (3) String method pathway is usually not the only transport pathway and alternate lower energy pathways should be explored. The free energy surface looks like it has not deviated from the string pathway. Longer simulations can help in the exploration of lower free energy pathways. 

      We agree with the reviewer that the string method pathway is usually not the only transport pathway and alternate lower energy pathways could exist. However, we also note that even if the fully occluded state is a local minimum and our free energy pathway does visit this missing local minimum after improved sampling, the overall free energy barrier will not be lowered from our current calculated value. This is because the current rate-limiting barrier arises from the transition from the OFB state to the IFF state, and the barrier top corresponds to the sugar molecule passing through the most constricted region in the cytoplasmic region, i.e., the IFC intermediate state visited after the IFB state is reached. Therefore, the free energy difference between the OFB state and the IFC state will not be changed by another hypothetical local minimum between the OFB and IFB states, i.e., the occluded OCB state. In other words, a hypothetical local minimum corresponding to the occluded state, even if it exists, will not decrease the overall rate-limiting barrier and may even increase it further, depending on the depth of the local minimum and the additional barriers of entering and escaping from this new minimum. 

      (4) The conformational change in transporters from the OF to the IF state is a complicated multi-step process. First, only 10 images in the string pathway are used to capture the transition from the OF to the IF state. I am not sure if this number is enough to capture the process. Second, the authors have used a geodesic interpolation algorithm to generate the intermediate images. However, looking at Figure 3B, it looks like the transition pathway has not captured the occluded (OC) conformation, where the transport tunnel is closed at both ends. Transporters typically follow a stepwise conformational change mechanism, where the OF state transitions to OC and then to the IF state. It appears that the interpolation algorithm has created an hourglass-like state, where the IF gates are opening and the OF gates are closing simultaneously, thereby creating a state where the transport tunnel is open on both sides of the membrane. These states are usually associated with high energy. References 30-42 cited in the manuscript reveal a distinct OC state for different transporters.

      In our simulations, even with 10 initial images representing the OF to IF conformational transition, the occluded state is sampled in the final string pathway. There is an ensemble of snapshots in which the extracellular and intracellular gates are both relatively narrower than in the OF and IF states, preventing the sugar from leaking into either side of the bulk solution. In contrast to the reviewer’s conjecture, we never observed an hourglass-like state in our simulations where both gates are open. Figure 3B is a visual representation of the backbone structure of the OCB state without explicitly showing the actual radius of the gating region, which also depends on the side chain conformations. Thus, Figure 3B alone cannot be used to conclude that we are predominantly sampling an hourglass-like intermediate conformation instead of the occluded state, as suggested by the reviewer.

      Moreover, not all of references 30-42 have sampled the occluded state, since many of them did not simulate the substrate translocation process at all. Of the ones that did sample substrate translocation, only two studied a cation-coupled MFS family symporter (refs 38, 40), and they did not provide the PMF for the entire translocation process. There is no strong evidence for a stable minimum corresponding to a fully occluded state in these two studies. In fact, different types of transporters with different coupling cations may exhibit different stability of the fully occluded state. For example, the fully occluded state has been experimentally observed for some MFS transporters, such as the multidrug transporter EmrD, but not for others, such as the lactose permease LacY. Thus, it is not generally true that a stable, fully occluded state exists in all transporters; it depends strongly on the specific transporter and the coupling ion under study.

      Reviewer 2:

      The manuscript by Liang and Guan provides an impressive attempt to characterize the conformational free energy landscape of a melibiose permease (MelB), a symporter member of the major facilitator superfamily (MFS) of transporters. Although similar studies have been conducted previously for other members of the MFS, each member or subfamily has its own unique features that make the employment of such methods quite challenging. While the methodology is indeed impressive, characterizing the coupling between large-scale conformational changes and substrate binding in membrane transporters is quite challenging and requires a sophisticated methodology. The conclusions obtained from the three sets of path-optimization and free energy calculations done by the authors are generally supported by the provided data and certainly add to our understanding of how sodium binding facilitates the transport of melibiose in MelB. However, the data are not generated reliably, which calls the relevance of the conclusions into question as well. I particularly have some concerns regarding the implementation of the methodology that I will discuss below.

      (1) In enhanced sampling techniques, often much attention is given to the sampling algorithm. Although the sampling algorithm is quite important and this manuscript has chosen an excellent pair, the string method with swarms of trajectories (SMwST) and replica-exchange umbrella sampling (REUS), for this task, there are other important factors that must be taken into account: more specifically, the collective variables used and the preparation of initial conformations for sampling. I have objections to both of these (particularly the latter) that I detail below. Overall, I am not confident that the free energy profiles generated (summarized in Figure 5) are reliable, and unfortunately, much of the data presented in this manuscript heavily relies on these free energy profiles.

      Since comments (1) and (2) from this review are related, please see our response to (2) below. 

      (2) The authors state that they have had an advantage over other similar studies in that they had two endpoints of the string to work from experimental data. I agree that this is an advantage. However, this could lead to some dangerous flaws in the methodology if not appropriately taken into account. Proteins such as membrane transporters have many slow degrees of freedom that cannot be fully captured within tens of nanoseconds (90 ns was the simulation time used here for the REUS). Biased sampling allows us to overcome this challenge to some extent, but it is virtually impossible to take into account all slow degrees of freedom in the enhanced sampling protocol (e.g., the collective variables used here do not represent anything related to sidechain dynamics). Therefore, if one mixes initial conformations that come from different initial structures (e.g., an OF state and an IF state from two different PDB files), it is very likely that despite all equilibration and relaxation during SMwST and REUS simulations, the conformations that come from different sources never truly mix. This is dangerous in that it is quite difficult to detect such inconsistencies, and from a theoretical point of view it makes the free energy calculations impossible. Methods such as WHAM and its various offshoots all rely on overlap between neighboring windows to calculate the free energy difference between two windows, and the overlap should be in all dimensions, not just the ones used for biasing. This is related to well-known issues such as hidden barriers and metastability. If one uses two different structures to generate the initial conformations, then the authors need to show their sampling has been long enough to allow the two sets of conformations to mix and overlap in all dimensions, which is a difficult task to do.

      We partly agree with the reviewer in that it is challenging to verify whether the structures generated from the two different initial structures are sufficiently mixed in terms of orthogonal degrees of freedom outside the CV space during our string method and REUS simulations. We acknowledge that our simulations are within 100 ns for each REUS window, and some slow degrees of freedom may not be fully sampled on this timescale. However, the conjectures and concerns raised by the reviewer are somewhat subjective in that they are almost impossible to disprove completely. In a sense, these concerns amount to the general suspicion that biomolecular simulation results are not completely converged, which cannot be fully ruled out for relatively complex biomolecular systems in any computational study involving MD simulations. We also note that comparison among the PMFs of the different cation bound/unbound states benefits from some error cancellation because the same sampling methods were used consistently for all three systems. Our main conclusions regarding the cooperative binding and transport of the two substrates rest on such comparison of the PMFs and additionally on the unbiased MD simulations. Thus, although the sampling may be incomplete, our key conclusions based on the relative comparison between the PMFs are more robust and less likely to suffer from insufficient sampling.

      (3) I also have concerns regarding the choice of collective variables. The authors have split the residues in each transmembrane helix into the cyto- and periplasmic sides. Then they have calculated the mass center distance between the cytoplasmic sides of certain pairs of helices and have also done the same for the periplasmic side. Given the shape of a helix, this does not seem to be an ideal choice since rather than the rotational motion of the helix, this captures more the translational motion of the helix. However, the transmembrane helices are more likely to undergo rotational motion than the translational one. 

      Our choice of CVs captures not only the translational motion but also the rotational motion of the helices. Consider a pair of helices. If there is a relative rotation in the angle between the two helices, causing the extracellular halves of the two helices to get closer and the intracellular halves to separate, this rotational motion is captured as a decrease in the CV describing the extracellular distance and an increase in the CV describing the intracellular distance between the two helices. Conversely, if one of the two CVs is forced to increase and the other forced to decrease, it can, in principle, bias the relative rotation of the two helices with respect to each other. Indeed, comparing Figure 3 with Figure S4, the reorientation of the helices with respect to the membrane normal (Fig. S4) is accompanied by the simultaneous decrease and increase in the pairwise distances between different segments of the helices. Therefore, our choice of CVs in the string method and REUS is not biased against the rotation of the helices, as the reviewer assumed.
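      As a toy illustration of this argument (our own sketch for this response, not the actual simulation setup; the geometry and helper names are invented), the following models two helices as straight segments and shows that rocking one helix moves the two half-distance CVs in opposite directions:

```python
import numpy as np

def half_com_distances(helix_a, helix_b):
    """Distances between the centers of mass of the top (periplasmic)
    and bottom (cytoplasmic) halves of two helices (N x 3 coordinate arrays)."""
    n = len(helix_a) // 2
    top = np.linalg.norm(helix_a[n:].mean(axis=0) - helix_b[n:].mean(axis=0))
    bottom = np.linalg.norm(helix_a[:n].mean(axis=0) - helix_b[:n].mean(axis=0))
    return top, bottom

# Two parallel idealized "helices" along z, 10 Angstroms apart
z = np.linspace(-15.0, 15.0, 20)
helix_a = np.column_stack([np.zeros_like(z), np.zeros_like(z), z])
helix_b = np.column_stack([np.full_like(z, 10.0), np.zeros_like(z), z])

top0, bot0 = half_com_distances(helix_a, helix_b)  # both = 10 for parallel helices

# Rock helix_b about its own center by 20 degrees in the x-z plane
theta = np.radians(20.0)
rot = np.array([[np.cos(theta), 0.0, np.sin(theta)],
                [0.0, 1.0, 0.0],
                [-np.sin(theta), 0.0, np.cos(theta)]])
center = helix_b.mean(axis=0)
helix_b_rot = (helix_b - center) @ rot.T + center

top1, bot1 = half_com_distances(helix_a, helix_b_rot)
# The pure rotation increases one half-distance and decreases the other,
# so a pair of such distance CVs does encode helix reorientation.
print(top0, bot0, top1, bot1)
```

      A purely rotational motion with no net translation thus changes both distance CVs, in opposite directions, which is the point made above.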

      (4) Convergence: String method convergence data does not show strong evidence for convergence (Figure S2) in my opinion. REUS convergence is also not discussed. No information is provided on the exchange rate or overlap between the windows.

      The convergence of the string method and REUS, as well as the exchange rates and the overlap between windows, will be discussed in the revised manuscript.
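      One simple way such window overlap can be quantified (a hedged sketch of the general idea; `window_overlap` is an illustrative helper, not the manuscript's actual analysis code) is the overlap coefficient of the CV histograms of neighboring umbrella windows:

```python
import numpy as np

def window_overlap(samples_a, samples_b, bins=50):
    """Overlap coefficient between two umbrella-sampling windows:
    the integral of min(p_a, p_b) over a shared CV histogram
    (1 = identical distributions, 0 = disjoint). WHAM-style
    reweighting requires this to be well above zero."""
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    edges = np.linspace(lo, hi, bins + 1)
    p_a, _ = np.histogram(samples_a, bins=edges, density=True)
    p_b, _ = np.histogram(samples_b, bins=edges, density=True)
    width = edges[1] - edges[0]
    return float(np.minimum(p_a, p_b).sum() * width)

# Synthetic CV samples from two neighboring windows
rng = np.random.default_rng(0)
w1 = rng.normal(0.0, 0.5, 5000)  # window centered at CV = 0.0
w2 = rng.normal(0.6, 0.5, 5000)  # neighboring window at CV = 0.6
print(f"overlap = {window_overlap(w1, w2):.2f}")
```

      For two Gaussian windows of width 0.5 separated by 0.6 along the CV, the coefficient comes out near 0.5, i.e., substantial overlap.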

      Reviewer 3:

      The paper from Liang and Guan details the calculation of the potential of mean force for the transition between two key states of the melibiose (Mel) transporter MelB. The authors used the string method along with replica-exchange umbrella sampling to model the transition between the outward- and inward-facing Mel-free states, including the binding and subsequent release of Mel. They find a barrier of ~6.8 kcal/mol and an overall free-energy difference of ~6.4 kcal/mol. They also investigate the same process without the co-transported Na+, finding a higher barrier, while in the D59C mutant, the barrier is nearly eliminated.

      For the Na+-bound state, the rate-limiting barrier is 8.4 kcal/mol instead of 6.8 kcal/mol. The overall free energy difference is 3.7 kcal/mol instead of 6.4 kcal/mol. These numbers need to be corrected in the public review.

      I found this to be an interesting and technically competent paper. I was disappointed actually to see that the authors didn't try to complete the cycle. I realize this is beyond the scope of the study as presented.

      We agree with the reviewer that characterizing the complete cycle is our eventual goal. However, characterizing the complete cycle of the transporter would require the free energy landscapes of the Na+ binding and unbinding processes in the sugar-bound and sugar-free states, as well as of the OF to IF conformational transition in the apo state. These additional calculations are expensive, and the amount of work they would require is estimated to be at least comparable to that of the current study. Therefore, we prefer to carry out and analyze these new simulations in a future study.

      The results are in qualitative agreement with expectations from experiments. Could the authors try to make this comparison more quantitative? For example, by determining the diffusivity along the path, the authors could estimate transition rates.

      In our revised manuscript, we will determine the diffusivity along the path and estimate transition rates.

      Relatedly, could the authors comment on how typical concentration gradients of Mel and Na+ would affect these numbers?

      The concentration gradients of Mel and Na+ can be varied in different experimental setups. In a typical active transport assay, Na+ has a higher concentration outside the cell, and melibiose has a higher concentration inside the cell. In the steady state, depending on the experimental setup, the extracellular Na+ concentration is in the range of 10-20 mM, and the intracellular concentration is self-balanced in the range of 3-4 mM due to the presence of other ion channels and pumps. In addition to the Na+ concentration gradient, there is also a transmembrane potential of -200 mV (the intracellular side being more negative than the extracellular side), which facilitates Na+ release into the intracellular side. In the steady state, the extracellular concentration of melibiose is ~0.4 mM, and the intracellular concentration is at least 1,000 times the extracellular concentration, i.e., greater than 0.4 M. In this scenario, the free energy change of intracellular melibiose translocation will be increased by about ~5 kcal/mol at 300 K, leading to a total ΔG of ~8 kcal/mol. The total barrier for melibiose translocation is expected to increase by less than 5 kcal/mol. However, the increase in ΔG for intracellular melibiose translocation will be compensated by a decrease in ΔG of similar magnitude (~5 kcal/mol) for intracellular Na+ translocation. In a typical sugar self-exchange assay, there is no net gradient of melibiose or Na+ across the membrane, and the overall free energy changes we calculated apply to this situation.
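      The concentration dependence quoted above follows from the standard relation ΔΔG = RT ln(C_destination/C_source). A minimal sketch (our own worked arithmetic for this response, not code from the study):

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def dg_concentration(c_ratio: float, temperature: float = 300.0) -> float:
    """Free energy cost (kcal/mol) of moving a solute against a
    concentration ratio c_ratio = C_destination / C_source."""
    return R_KCAL * temperature * math.log(c_ratio)

for ratio in (1e3, 1e4):
    print(f"{ratio:.0e}-fold gradient: {dg_concentration(ratio):.1f} kcal/mol")
```

      A 1,000-fold ratio gives about 4.1 kcal/mol and a 10,000-fold ratio about 5.5 kcal/mol at 300 K, bracketing the ~5 kcal/mol figure used in the response.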

    1. Solidatech: Breaking down the program and its offers for nonprofits

      Introduction

      This document provides a detailed analysis of the webinar "Solidatech, que propose-t-on et comment ça fonctionne?" ("Solidatech: what do we offer and how does it work?").

      It follows the webinar's key sections, offering concise summaries for a better understanding of the services Solidatech provides to nonprofit associations.

      1. General presentation of Solidatech (0:00 - 7:22)

      Solidatech: a digital solidarity program (0:00 - 2:50)

      Elodie, communications officer at Solidatech, presents the digital solidarity program, which aims to help nonprofits make better use of digital technology to increase their impact.

      Solidatech, created in 2008, is run by a team of about fifteen people based in Paris and in Deux-Sèvres.

      The program belongs to Ateliers du Bocage, a work-integration cooperative and member of the Emmaüs movement specializing in the collection and reuse of IT equipment.

      Solidatech is part of the international TechSoup network, which brings together similar programs around the world.

      Solidatech's beneficiaries and activities (2:50 - 6:00)

      Loi 1901 associations make up the majority of beneficiaries, alongside foundations, endowment funds, and public libraries.

      Registration for the program is free, regardless of the nonprofit's sector, operating model, or size.

      Solidatech acts along three main axes:

      Facilitating access to digital technology (discounted software and IT hardware).

      Supporting nonprofits in developing their digital practices (resource center, digital assessments, etc.).

      Co-producing and sharing knowledge about the digital transition of nonprofits.

      Overview of the software and hardware offers (6:00 - 7:22)

      A wide range of software is available to meet nonprofits' diverse needs: accounting, antivirus, collaboration tools, etc.

      Both international publishers (Microsoft, Adobe, Zoom) and local French solutions (AssoConnect, Tolteck, NetExplorer) are represented.

      The hardware offer consists mainly of equipment refurbished by Ateliers du Bocage, plus a selection of new hardware through partnerships (Cisco, Dell).

      2. Navigating the new Solidatech website (7:22 - 19:50)

      Redesign of Solidatech.fr and its new features (7:22 - 10:35)

      The new Solidatech.fr site, launched at the end of October, sports a modernized design and improved ergonomics for simpler navigation.

      It serves as the gateway to the Solidatech program, centralizing information on offers, resources, news, events, and other useful information.

      The site directs users to secondary platforms according to their specific needs.

      Partner platforms: TechSoup.fr and Solidatech-ateliersdubocage.fr (10:35 - 15:45)

      Access to discounted software and hardware orders is now handled on partner platforms.

      TechSoup.fr, the platform shared by all member countries of the TechSoup network, hosts the discounts on major software (Adobe, Windows, Zoom).

      Solidatech-ateliersdubocage.fr, a dedicated shop, handles sales of hardware refurbished by Ateliers du Bocage.

      Smooth navigation between the platforms (15:45 - 19:50)

      Navigation between Solidatech.fr, TechSoup.fr, and Solidatech-ateliersdubocage.fr is smooth and intuitive.

      Links redirecting to the appropriate platform, based on users' needs, are built into the site.

      The goal is to simplify the user journey and guarantee a consistent experience.

      3. Catalog of discounted software and solutions (19:50 - 30:25)

      Exploring the catalog and eligibility criteria (19:50 - 24:15)

      The online catalog lists more than 260 products, including around 200 software titles and some sixty new-hardware offers.

      Eligibility for the offers depends on criteria set by each publisher/partner (budget, size, sector of activity).

      Each product page details the terms of the Solidatech offer, the reduced prices, eligibility restrictions, and how the offer works.

      Discount coupon system and administrative fees (24:15 - 27:40)

      Many offers work through discount coupons that grant a reduction on the subscription.

      Administrative fees, which vary by offer, are charged for purchasing the coupon.

      These fees help fund the operation and development of the Solidatech program.

      Checking eligibility and adding to the cart (27:40 - 30:25)

      Once logged into their Solidatech account on TechSoup.fr, users can check their eligibility for the offers.

      A message indicates whether the nonprofit can benefit from the offer and add the product to the cart.

      Being logged in also makes it easier to find specific offers via the search bar.

      4. Refurbished and new hardware (30:25 - 40:00)

      Online shop Solidatech-ateliersdubocage.fr (30:25 - 35:35)

      The online shop dedicated to refurbished hardware offers laptops, desktop towers, smartphones, tablets, and various accessories.

      Products refurbished by Ateliers du Bocage come with a 12-month warranty and are certified Service France Garantie.

      Placing an order requires creating a customer account on the shop, separate from the Solidatech account.

      Validation token and ordering process (35:35 - 40:00)

      A validation token is required to create the customer account for the first order.

      The token is generated automatically on the TechSoup.fr platform, in the "Mon compte" (My account) section.

      It verifies the nonprofit's eligibility and allows the hardware order to be completed.

      5. Support and services (40:00 - 59:15)

      Qualiopi-certified training (40:00 - 47:15)

      Solidatech, a Qualiopi-certified training organization, offers thematic training sessions to raise awareness of digital issues.

      Various topics are covered: volunteering, communication, cybersecurity, digital strategy, collaborative work, etc.

      The training sessions, intended for both employees and volunteers, are available remotely, in person, or in a hybrid format.

      Cloud migration services (47:15 - 50:10)

      Cloud migration services are offered in partnership with IT for Life and WeScale Consulting, specialists in the nonprofit sector.

      Two solutions are available, Google Workspace and Microsoft 365, including personalized support for configuring the tools.

      Prestatech platform (50:10 - 53:30)

      Prestatech, an online platform, lists trusted IT service providers recommended by other nonprofits.

      Filters make it possible to find providers by area of expertise, funding arrangement, type of service, and region.

      Solidatech facilitates the introduction, but nonprofits manage the collaboration with the chosen provider directly.

      Digital self-assessment tool (53:30 - 57:25)

      A free online digital self-assessment tool is available to evaluate a nonprofit's digital maturity.

      Organized around seven strategic pillars, it helps identify strengths and weaknesses and build a digital roadmap.

      Personalized resources and solutions are suggested based on the assessment results.

      Resource center and events calendar (57:25 - 59:15)

      The resource center offers a blog rich in articles, tutorials, tool comparisons, studies, guides, webinar replays, and practical advice.

      The events calendar announces upcoming webinars, forums, calls for projects, and other events organized by Solidatech.

      Conclusion

      The webinar "Solidatech, que propose-t-on et comment ça fonctionne?" provides a complete overview of the program and its many services dedicated to nonprofits.

      The detailed information on software offers, IT hardware, training, personalized support, and available resources helps nonprofits better understand how Solidatech can help them optimize their use of digital technology and strengthen their impact.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer’s comments

      We are most grateful for the opportunity to address the reviewer comments. Point-by-point responses are presented below.

      Overall, the paper has several strengths, including leveraging large-scale, multi-modal datasets, using computational reasonable tools, and having an in-depth discussion of the significant results.

      We thank the reviewer for the very supportive comments.

      Based on the comments and questions, we have grouped the concerns and corresponding responses into three categories.

      (1) The scope and data selection

      The results are somewhat inconclusive or not validated.

      The overall results are carefully designed, but most of the results are descriptive. While the authors are able to find additional evidence either from the literature or explain the results with their existing knowledge, none of the results have been biologically validated. Especially, the last three result sections (signaling pathways, eQTLs, and TF binding) further extended their findings, but the authors did not put the major results into any of the figures in the main text.

      The goal of this manuscript is to provide a list of putative childhood obesity target genes to yield new insights and help drive further experimentation. Moreover, the outputs from signaling pathways, eQTLs, and TF binding, although noteworthy and supportive of our method, were not particularly novel. In our manuscript we placed our focus on the novel findings from the analyses. We did, however, report the part of the eQTLs analysis concerning ADCY3, which brought new insight to the pathology of obesity, in Figure 4C.

      The manuscript would benefit from an explanation of the rationale behind the selection of the 57 human cell types analyzed. It is essential to clarify whether these cell types have unique functions or relevance to childhood development and obesity.

      We elected to comprehensively investigate the GWAS-informed cellular underpinnings of childhood development and obesity. By including a diverse range of cell types from different tissues and organs, we sought to capture the multifaceted nature of cellular contributions to obesity-related mechanisms, and open new avenues for targeted therapeutic interventions.

      There are clearly cell types that are already established as being key to the pathogenesis of obesity when dysregulated: adipocytes for energy storage, immune cell types regulating inflammation and metabolic homeostasis, hepatocytes regulating lipid metabolism, pancreatic cell types intricately involved in glucose and lipid metabolism, skeletal muscle for glucose uptake and metabolism, and brain cell types in the regulation of appetite, energy expenditure, and metabolic homeostasis.

      While it is practical to focus on cell types already proven to be associated with or relevant to obesity, this approach has its limitations. It confines our understanding to established knowledge and rules out the potential for discovering novel insights from new cellular mechanisms or pathways that could play significant roles in the pathogenesis of obesity. Therefore, it was essential to reflect known biology against the unexplored cell types to expand our overall understanding and potentially identify innovative targets for treatment or prevention.

      I wonder whether the used epigenome datasets are all from children. Although the authors use literature to support that body weight and obesity remain stable from infancy to adulthood, it remains uncertain whether epigenomic data from other life stages might overlook significant genetic variants that uniquely contribute to childhood obesity.

      The datasets utilized in our study were derived from a combination of sources, both pediatric and adult. We recognize that epigenetic profiles can vary across different life stages but our principal effort was to characterize susceptibility BEFORE disease onset.

      Given that the GTEx tissue samples are derived from adult donors, there appears to be a mismatch with the study's focus on childhood obesity. If possible, identifying alternative validation strategies or datasets more closely related to the pediatric population could strengthen the study's findings.

      We thank the reviewer for raising this important point. We acknowledge that the GTEx tissue samples are derived from adult donors, which might not perfectly align with the study's focus on childhood obesity. The ideal strategy would be a longitudinal design that follows individuals from childhood into adulthood to bridge the gap between pediatric and adult data, offering systematic insights into how early-life epigenetic markers influence obesity later in life. In future work, we aim to carry out such efforts, which will represent a substantial time and financial commitment.

      Along the same lines, the Developmental Genotype-Tissue Expression (dGTEx) Project is a new effort to study development-specific genetic effects on gene expression at 4 developmental windows spanning from infant to post-puberty (0-18 years). Donor recruitment began in August 2023 and remains ongoing. Tissue characterization and data production are underway. We hope that with the establishment of this resource, our future research in the field of pediatric health will be further enhanced.

      Figure 1B: in subplots c and d, the results are either from Hi-C or capture-C. Although the authors use different colors to denote them, I cannot help wondering how much difference between Hi-C and capture-C brings in. Did the authors explore the difference between the Hi-C and capture-C?

      Thank you for your comment. It is not within the scope of our paper to explore the differences between the Hi-C and Capture-C methods. In the context of our study, both methods serve the same purpose of detecting chromatin loops that bring putative enhancers to sometimes genomically distant gene promoters. Consequently, our focus was on utilizing these methods to identify relevant chromatin interactions rather than comparing their technical differences.

      (2) Details on defining different categories of the regions of interest

      Some technical details are missing.

      While the authors described all of their analysis steps, a lot of the time, they did not mention the motivation. Sometimes, the details were also omitted.

      We have added a section to the revision to address the rationale behind the different OCR categories.

      Line 129: should "-1,500/+500bp" be "-500/+500bp"?

      A gene promoter was defined as the region from 1,500 bases upstream to 500 bases downstream of the TSS. Most transcription factor binding sites are distributed upstream (5’) of the TSS, and the assembly of the transcription machinery occurs up to 1,000 bases 5’ of the TSS. Given our interest in SNPs that can potentially disrupt transcription factor binding, this promoter definition allowed us to capture such SNPs in our analyses.
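      For illustration only (the coordinates and helper names below are hypothetical, not from the study's pipeline), this promoter definition amounts to a simple strand-aware interval test:

```python
def promoter_interval(tss: int, strand: str, up: int = 1500, down: int = 500):
    """Promoter window as defined here: 1,500 bp upstream to 500 bp
    downstream of the TSS, with up/downstream oriented by strand."""
    if strand == "+":
        return tss - up, tss + down
    return tss - down, tss + up

def snp_in_promoter(pos: int, tss: int, strand: str) -> bool:
    """True if a SNP position falls inside the defined promoter window."""
    start, end = promoter_interval(tss, strand)
    return start <= pos <= end

print(snp_in_promoter(100_900, 101_000, "+"))  # True: 100 bp upstream of the TSS
print(snp_in_promoter(101_600, 101_000, "+"))  # False: 600 bp downstream
```

      A window of -500/+500 bp, by contrast, would miss upstream binding-site SNPs between 500 and 1,500 bp from the TSS, which motivates the asymmetric definition.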

      How did the authors define a contact region?

      Chromatin contact regions identified by Hi-C or Capture-C assays are always reported as pairs of chromatin regions. The Supplementary eMethods provide details on the method of processing and interaction calling from the Hi-C and Capture-C data.

      The manuscript would benefit from a detailed explanation of the methods used to define cREs, particularly the process of intersecting OCRs with chromatin conformation data. The current description does not fully clarify how the cREs are defined.

      In the result section titled "Consistency and diversity of childhood obesity proxy variants mapped to cREs", the authors introduced the different types of cREs in the context of open chromatin regions and chromatin contact regions, and TSS. Figure 2A is helpful in some way, but more explanation is definitely needed. For example, it seems that the authors introduced three chromatin contacts on purpose, but I did not quite get the overall motivation.

      We apologize for the confusion. Our definition of cREs is consistent throughout the study. Figure 2A will be the first Figure 1A in the revision in order to aid the reader.

      The 3 representative chromatin loops illustrate different ways the chromatin contact regions (pairs of blue regions under blue arcs) can overlap with OCRs (yellow regions under yellow triangles – ATAC peaks) and gene promoters.

      (1) The first chromatin loop has one contact region that overlaps with OCRs at one end and with the gene promoter at the other. This satisfies the formation of cREs; thus, the area under the yellow ATAC-peak triangle is green.

      (2) The second loop overlaps with an OCR at only one end, and there is no gene promoter nearby, so it does not qualify for cRE formation.

      (3) The third chromatin loop has OCR and promoter overlapping at one end. We defined this as a special cRE formation; thus, the area under the yellow ATAC-peak triangle is green.

      To avoid further confusion for the reader, we have eliminated this variation in the new illustration for the revised manuscript.
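
      The three cases above amount to a simple interval-intersection rule. A minimal sketch follows; the names and coordinates are hypothetical, and this is not the actual Hi-C/Capture-C processing pipeline, only an illustration of the overlap logic.

```python
# Illustrative sketch (not the authors' pipeline): an OCR qualifies as a
# cRE if it sits in a chromatin contact end whose partner end contains a
# gene promoter (case 1), or if the OCR and a promoter share the same
# contact end (case 3). Intervals are (start, end) half-open tuples.

def overlaps(a, b):
    """True if two half-open intervals (start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]

def is_cre(ocr, contact_pairs, promoters):
    for end_a, end_b in contact_pairs:
        for near, far in ((end_a, end_b), (end_b, end_a)):
            if overlaps(ocr, near):
                # Case 1: promoter at the distal end of the loop.
                if any(overlaps(p, far) for p in promoters):
                    return True
                # Case 3: OCR and promoter share the same contact end.
                if any(overlaps(p, near) for p in promoters):
                    return True
    return False  # Case 2: OCR in a contact end with no promoter anywhere

# Hypothetical example: a loop connecting (1000, 2000) to (9000, 10000),
# a promoter in the distal end, and an OCR in the proximal end.
contacts = [((1000, 2000), (9000, 10000))]
promoters = [(9500, 9700)]
print(is_cre((1200, 1500), contacts, promoters))  # True  (case 1)
print(is_cre((5000, 5200), contacts, promoters))  # False (no contact overlap)
```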

      Figure 2A: The authors used triangles filled differently to denote different types of cREs but I wonder what the height of the triangles implies. Please specify.

      The triangles are illustrations for ATAC-seq peaks, and the yellow chromatin regions under them are OCRs. The different heights of ATAC-seq peaks are usually quantified as intensity values for OCRs. However, in our study, when an ATAC-seq peak passed the significance threshold from the data pipeline, we only considered their locations, regardless of their intensities. To avoid further confusion for the reader, we have eliminated this variation in the new illustration for the revised manuscript.

      Figure 1B-c: the title should be "OCRs at putative cREs". Similarly for Figure 1B-d.

      cREs are a subset of OCRs.

      In the section "Cell type specific partitioned heritability", the authors used "4 defined sets of input genomic regions". Do these correspond to the four types of regions in Figure 2A?

      Figure 2A is now Figure 1A in the revision and has been modified to show how we define OCRs and cREs.

      It seems that the authors described the 771 proxies in "Genetic loci included in variant-to-genes mapping" (ln 154), and then somehow narrowed down from 771 to 94 (according to ln 199) because they are cREs. It would be great if the authors could describe the selection procedure together, rather than in isolation, which made it quite difficult to understand.

      In the Methods section entitled “Genetic loci included in variant-to-genes mapping," we described the process of LD expansion to include 771 proxies from 19 sentinel signals significantly associated with childhood obesity. Not all of these proxies fall within our defined cREs. Figure 2B, now Figure 2A in the revision, illustrates the proportions of these proxies located within the different types of regions, reducing the proxy list to the 94 located within our defined cREs.

      Figure 2. What's the difference between the 771 and 758 proxies?

      13 out of 771 proxies did not fall within any defined regions. The remaining 758 were located within contact regions of at least one cell type regardless of chromatin state.
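
      The bookkeeping described here (771 proxies total, some outside all defined regions, some in contact regions, a subset in cREs) is a straightforward overlap classification. A minimal sketch, with hypothetical positions and intervals rather than the study's data:

```python
# Hedged illustration of the proxy classification described above; all
# positions and intervals are made up for the example.

def classify_proxies(proxies, contact_regions, cres):
    """Split SNP proxies into: outside all defined regions, inside a
    contact region (any chromatin state), and inside a cRE."""
    def in_any(pos, regions):
        return any(start <= pos < end for start, end in regions)
    outside = [p for p in proxies if not in_any(p, contact_regions)]
    in_contact = [p for p in proxies if in_any(p, contact_regions)]
    in_cre = [p for p in proxies if in_any(p, cres)]
    return outside, in_contact, in_cre

proxies = [100, 550, 1200, 9999]            # hypothetical SNP positions
contact_regions = [(0, 600), (1000, 2000)]  # hypothetical contact ends
cres = [(500, 600)]                         # hypothetical cRE subset
out, contact, cre = classify_proxies(proxies, contact_regions, cres)
print(len(out), len(contact), len(cre))  # 1 3 1
```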

      (3) Typos

      In the paragraph "Childhood obesity GWAS summary statistics", the authors may want to describe the case/control numbers in two stages differently. "in stage 1" and "921 cases" together made me think "1,921" is one number.

      This has been amended in the revision.

      Hi-C technology should be spelled as Hi-C. In many places it is misspelled as "hi-C". In Figure 1, the authors used "hiC" in the legend. Similarly, Capture-C was sometimes spelled as "capture-C" in the manuscript.

      At the end of the fifth row in the second paragraph of the Introduction section: "exisit" should be "exist".

      In Figure 2A: "Within open chromatin contract region" should be "Within open chromatin contact region”

      These typos and terminology inconsistencies have been amended in the revision.

    1. Author response:

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing characteristic of wavelet convolution in time-frequency decomposition, task-related early activities can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms will include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, an ideal baseline duration is recommended to be around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we restricted the baseline to within the 400 ms preceding the stimulus to ensure its purity. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).
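
      The arithmetic behind this choice can be checked directly, using the values stated above:

```python
# Sanity check of the baseline-window arithmetic in the text: a baseline
# of roughly 10-20% of the 2000 ms epoch, ending before 0 ms to avoid
# temporal leakage from wavelet convolution.

epoch_ms = 2000
baseline = (-400, -200)            # selected window, in ms
length = baseline[1] - baseline[0]
fraction = length / epoch_ms
print(length, f"{fraction:.0%}")   # 200 10%
assert 0.10 <= fraction <= 0.20    # within the recommended 10-20% range
assert baseline[1] < 0             # ends before stimulus onset
```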

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it utilized new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which of the decoded indicators were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, its results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it more straightforwardly (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of the conflict processing process. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, we inappropriately cited a study in the introduction that suggested P1 shows differences in congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and congruency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychol. Sci., 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Comment 1: In the Results section, the rationale behind selecting the beta band for the central (C3, CP3, Cz, CP4, C4) regions and the theta band for the fronto-central (Fz, FCz, Cz) regions is not clearly explained in the main text. This information is only mentioned in the figure captions. Additionally, why was the beta band chosen for the S-ROI central region and the theta band for the S-ROI fronto-central region? Was this choice influenced by the MVPA results?

      We thank the reviewer for the question regarding the rationale for the S-ROI selection in our study. The beta band was chosen for the central region due to its established relevance in motor control (Engel & Fries, 2010), movement planning (Little et al., 2019) and motor inhibition (Duque et al., 2017). The fronto-central theta band (or frontal midline theta) was a widely recognized indicator in cognitive control research (Cavanagh & Frank, 2014), associated with conflict detection and resolution processes. Moreover, recent empirical evidence suggested that the fronto-central theta reflected the coordination and integration between stimuli and responses (Senoussi et al., 2022). Although we have described the cognitive processes linked to these different frequencies in the introduction and discussion sections, along with the potential patterns of results observed in Stroop-related studies, we did not specify the involved cortical areas. Therefore, we have specified these areas in the introduction to enhance the clarity of the revised version (in the fourth paragraph of the Introduction section).

      Regarding whether the selection of S-ROIs was influenced by the MVPA results, we would like to clarify here that we selected the S-ROIs based on prior research and then conducted the decoding analysis. Specifically, we first extracted the data representing different frequency indicators (three F-ROIs and three S-ROIs) as features, followed by decoding to obtain the MVPA results. Subsequently, the time-frequency analysis, combined with the specific time windows during which each frequency was decoded, provided detailed interaction patterns among the variables for each indicator. The specifics of feature selection are described in the revised version (in the first paragraph of the Multivariate Pattern Analysis section).

      Comment 2: In the Data Analysis section, line 424 states: “Only trials that were correct in both the memory task and the Stroop task were included in all subsequent analyses. In addition, trials in which response times (RTs) deviated by more than three standard deviations from the condition mean were excluded from behavioral analyses.” The percentage of excluded trials should be reported. Also, for the EEG-related analyses, were the same trials excluded, or were different criteria applied?

      We thank the reviewer for this suggestion. Beyond the behavioral exclusion criteria, trials with EEG artifacts were also excluded from the data for the EEG-related analyses. We have now reported the percentage of excluded trials for both behavioral and EEG data analyses in the revised version (in the second paragraph of the EEG Recording and Preprocessing section and the first paragraph of the Behavioral Analysis section).

      Comment 3: In the Methods section, line 493 mentions: “A 400-200 ms pre-stimulus time window was selected as the baseline time window.” What is the justification in the literature for choosing the 400-200 ms pre-stimulus window as the baseline? Why was the 200-0 ms pre-stimulus period not considered?

      We thank the reviewer for this question and would like to provide the following justification. First, although a baseline ending at 0 ms is common in ERP analyses, it may not be suitable for time-frequency analysis. Due to the inherent temporal smoothing characteristic of wavelet convolution in time-frequency decomposition, task-related early activities can leak into the pre-stimulus period (before 0 ms) (Cohen, 2014). This means that extending the baseline to 0 ms will include some post-stimulus activity in the baseline window, thereby increasing baseline power and compromising the accuracy of the results. Second, an ideal baseline duration is recommended to be around 10-20% of the entire trial of interest (Morales & Bowers, 2022). In our study, the epoch duration was 2000 ms, making 200-400 ms an appropriate baseline length. Third, given that the minimum duration of the fixation point before the stimulus in our experiment was 400 ms, we chose the 400 ms before the stimulus as the baseline point to ensure its purity. In summary, considering edge effects, duration requirements, and the need to exclude other influences, we selected a baseline correction window of -400 to -200 ms. To enhance the clarity of the revised version, we have provided the rationale for the selected time windows along with relevant references (in the first paragraph of the Time-frequency analysis section).

      Comment 4: Is the primary innovation of this study limited to the methodology, such as employing MVPA and RSA to establish the relationship between late theta activity and behavior?

      We thank the reviewer for this insightful question and would like to clarify that our research extends beyond mere methodological innovation; rather, it utilized new methods to explore novel theoretical perspectives. Specifically, our research presents three levels of innovation: methodological, empirical, and theoretical. First, methodologically, MVPA overcame the drawbacks of traditional EEG analyses based on specific averaged voltage intensities, providing new perspectives on how the brain dynamically encoded particular neural representations over time. Furthermore, RSA aimed to identify which indicators among the decoded were directly related to behavioral representation patterns. Second, in terms of empirical results, using these two methods, we have identified for the first time three EEG markers that modulate the Stroop effect under verbal working memory load: SP, late theta, and beta, with late theta being directly linked to the elimination of the behavioral Stroop effect. Lastly, from a theoretical perspective, we proposed the novel idea that working memory played a crucial role in the late stages of conflict processing, specifically in the stimulus-response mapping stage (the specific theoretical contributions are detailed in the second-to-last paragraph of the Discussion section).

      Comment 5: On page 14, lines 280-287, the authors discuss a specific pattern observed in the alpha band. However, the manuscript does not provide the corresponding results to substantiate this discussion. It is recommended to include these results as supplementary material.

      We thank the reviewer for this suggestion. We added a new figure along with the corresponding statistical results that displayed the specific result patterns for the alpha band (Supplementary Figure 1).

      Comment 6: On page 16, lines 323-328, the authors provide a generalized explanation of the findings. According to load theory, stimuli compete for resources only when represented in the same form. Since the pre-memorized Chinese characters are represented semantically in working memory, this explanation lacks a critical premise: that semantic-response mapping is also represented semantically during processing.

      We thank the reviewer for this insightful suggestion. We fully agree with the reviewer’s perspective. As stated in our revised version, load theory suggests that cognitive resources are limited and dependent on a specific type (in the second paragraph of the Discussion section). The previously memorized Chinese characters are stored in working memory in the form of semantic representations; meanwhile the stimulus-response mapping should also be represented semantically, leading to resource occupancy. We have included this logical premise in the revised version (in the third-to-last paragraph of the Discussion section).

      Comment 7: The classic Stroop task includes both a manual and a vocal version. Since stimulus-response mapping in the vocal version is more automatic than in the manual version, it is unclear whether the findings of this study would generalize to the impact of working memory load on the Stroop effect in the vocal version.

      We fully agree with the reviewer’s point that the verbal version of the Stroop task differs from the manual version in terms of the degree of automation in the stimulus-response mapping. Specifically, the verbal version relies on mappings that are established through daily language use, while the manual version involves arbitrary mappings created in the laboratory. Therefore, the stimulus-response mapping in the verbal response version is more automated and less likely to be suppressed. However, our previous research indicated that the degree of automation in the stimulus-response mapping was influenced by practice (Chen et al., 2013). After approximately 128 practice trials, semantic conflict almost disappears, suggesting that the level of automation in stimulus-response mapping for the verbal Stroop task is comparable to that of the manual version (Chen et al., 2010). Given that participants in our study completed 144 practice trials (in the Procedure section), we believe these findings can be generalized to the verbal version.

      Comment 8: While the discussion section provides a comprehensive analysis of the study’s results, the authors could further elaborate on the theoretical and practical contributions of this work.

      We thank the reviewer for the constructive suggestions. We recognize that the theoretical and practical contributions of the study were not thoroughly elaborated in the original manuscript. Therefore, we have now provided a more detailed discussion. Specifically, the theoretical contributions focus on advancing load theory and highlighting the critical role of working memory in conflict processing. The practical contributions emphasize the application of load theory and the development of intervention strategies for enhancing inhibitory control. A more detailed discussion can be found in the revised version (in the second-to-last paragraph of the Discussion section).

      Reviewer #2 (Public review):

      Comment 1: As the researchers mentioned, a previous study reported a diminished Stroop effect with concurrent working memory tasks to memorize meaningless visual shapes rather than memorize Chinese characters as in the study. My main concern is that lower-level graphic processing when memorizing visual shapes also influences the Stroop effect. The stage of Stroop conflict processing affected by the working memory load may depend on the specific content of the concurrent working memory task. If that’s the case, I sense that the generalization of this finding may be limited.

      We thank the reviewer for this insightful concern. As mentioned in the manuscript, this may be attributed to the inherent characteristics of Chinese characters. In contrast to English words, the processing of Chinese characters relies more on graphemic encoding and memory (Chen, 1993). Therefore, the processing of line patterns essentially occupies some of the resources needed for character processing, which aligns with our study’s hypothesis based on dimensional overlap. Additionally, regarding the results, even though the previous study presented lower-level line patterns, the results still showed that the working memory load modulated the later theta band. We hypothesize that, regardless of the specific content of the pre-presented working memory load, once the stimulus disappears from view, these loads are maintained as representations in the working memory platform. Therefore, they do not influence early perceptual processing, and resource competition only occurs once the distractors reach the working memory platform. Lastly, a previous study has shown that spatial loads, which do not overlap with either the target or distractor dimensions, do not influence the conflict effect (Zhao et al., 2010). Taken together, we believe that regardless of the specific content of the concurrent working memory tasks, as long as they occupy resources related to irrelevant stimulus dimensions, they can influence the late-stage processing of the conflict effect. Perhaps our original manuscript did not convey this clearly, so we have rephrased it in a more straightforward manner (in the second paragraph of the Discussion section).

      Comment 2: The P1 and N450 components are sensitive to congruency in previous studies as mentioned by the researchers, but the results in the present study did not replicate them. This raised concerns about data quality and needs to be explained.

      We thank the reviewer for this insightful concern. For P1, we aimed to convey that the early perceptual processing represented by P1 is part of the conflict processing process. Therefore, we included it in our analysis. Additionally, as mentioned in the discussion, most studies find P1 to be insensitive to congruency. However, we inappropriately cited a study in the introduction that suggested P1 shows differences in congruency, which is among the few studies that hold this perspective. To prevent confusion for readers, we have removed this citation from the introduction.

      As for N450, most studies have indeed found it to be influenced by congruency. In our manuscript, we did not observe a congruency effect at our chosen electrodes and time window. However, significant congruency effects were detected at other central-parietal electrodes (CP3, CP4, P5, P6) during the 350-500 ms interval. The interaction between task type and congruency remained non-significant, consistent with previous results. Furthermore, with respect to the location of the electrodes chosen, existing studies on N450 vary widely, including central-parietal electrodes and frontal-central electrodes (for a review, see Heidlmayr et al., 2020). We speculate that this phenomenon may be related to the extent of practice. With fewer total trials, the task may involve more stimulus conflicts, engaging more frontal brain areas. On the other hand, with more total trials, the task may involve more response conflicts, engaging more central-parietal brain areas (Chen et al., 2013; van Veen & Carter, 2005). Due to the extensive practice required in our study, we identified a congruency N450 effect in the central-parietal region. We apologize for not thoroughly exploring other potential electrodes in the previous manuscript, and we have revised the results and interpretations regarding N450 accordingly in the revised version (in the N450 section of the ERP results and the third paragraph of the Discussion section).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Comment 1: In the Introduction, line 108 states: “Second, alpha oscillations (8-13 Hz) can serve as a neural inverse index of mental activity or alertness, while a decrease in alpha power reflects increased alertness or enhanced attentional inhibition of distractors (Arakaki et al., 2022; Tafuro et al., 2019; Zhou et al., 2023; Zhu et al., 2023).” Please clarify which specific psychological process related to conflict processing is reflected by alpha oscillations.

      We appreciate your suggestion and we have clearly highlighted the role of alpha oscillations in attentional engagement during conflict processing in the revised version (in the third-to-last paragraph of the introduction).

      Comment 2: In Figures 3C and 3E, a space is needed between “amplitude” and the preceding parenthesis. Similar adjustments are required in Figures 4A, 4B, 4C, 5C, and 6C. Additionally, in Figures 3B and 3D, a space should be added between the numbers and “ms.” This issue also appears in Figure 8. Please review all figures for these formatting inconsistencies.

      We apologize for the inconsistency in formatting and have corrected them throughout the revised version.

      Comment 3: There are some clerical errors in the manuscript that need correction. For instance, on page 19, line 403: “Participants were asked to answer by pressing one of two response buttons (“S” with the left ring finger and “L” with the left ring finger).” This should be corrected to: “L” with the right ring finger. I recommend that the authors carefully proofread the manuscript to identify and correct such errors.

      We sincerely apologize for the errors present in the manuscript and have now carefully proofread it (in the Procedure section).

      Comment 4: On page 13, line 254, the elimination of the Stroop effect should not be interpreted as an improvement in processing.

      We greatly appreciate your suggestion. We agree that the elimination of the Stroop effect should not be confused with improvements in processing. We have corrected this in the revised version (the second paragraph of the Discussion section).

      Reviewer #3 (Recommendations for the authors):

      Comment 1: In the introduction section, the N450 was introduced as “a frontal-central negative deflection”, but in the methods part the N450 was computed using central-parietal electrodes. This inconsistency is confusing and needs to be clarified.

      We apologize for this confusion. We have provided a detailed explanation regarding the differences in electrodes and the rationale behind choosing central-parietal electrodes in our response to Reviewer 2’s second comment. To clarify, we have updated the introduction to consistently label them as central-parietal deflections (in the third paragraph of the Introduction section).

      Comment 2: I speculate the “beta” was mistakenly written as “theta” in line 212.

      We sincerely apologize for this mistake. We have corrected this error (in the RSA results section).

      Comment 3: The speculation that “changes in beta bands may be influenced by theta bands, thereby indirectly influencing the behavioral Stroop effect” needs to be rationalized.

      We appreciate your suggestion. What we intended to convey is that we found an interaction effect in the beta bands; however, the RSA results did not show a correlation with the behavioral interaction effect. We speculate that beta activity might be influenced by the theta bands. On the one hand, we realize that the idea of beta bands indirectly influencing the behavioral Stroop effect was inappropriate, and we have removed this point in the revised version. On the other hand, we have provided rational evidence for the idea that beta bands may be influenced by theta bands. This is based on the biological properties of theta oscillations, which support communication between different cortical neural signals, and their functional role in integrating and transmitting task-relevant information to response execution (in the third-to-last paragraph of the Discussion section).

      Comment 4: Typo in line 479: [10,10].

      We sincerely apologize for this mistake. We have corrected this error: [-10,10] (in the Multivariate pattern analysis section).

      Reference

      Cavanagh, J. F., & Frank, M. J. (2014). Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences, 18(8), 414–421. https://doi.org/10.1016/j.tics.2014.04.012

      Chen, M. J. (1993). A Comparison of Chinese and English Language Processing. In Advances in Psychology (Vol. 103, pp. 97–117). North-Holland. https://doi.org/10.1016/S0166-4115(08)61659-3

      Chen, X. F., Jiang, J., Zhao, X., & Chen, A. (2010). Effects of practice on semantic conflict and response conflict in the Stroop task. Psychological Science, 33, 869–871.

      Chen, Z., Lei, X., Ding, C., Li, H., & Chen, A. (2013). The neural mechanisms of semantic and response conflicts: An fMRI study of practice-related effects in the Stroop task. NeuroImage, 66, 577–584. https://doi.org/10.1016/j.neuroimage.2012.10.028

      Cohen, M. X. (2014). Analyzing Neural Time Series Data: Theory and Practice. The MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001

      Duprez, J., Gulbinaite, R., & Cohen, M. X. (2020). Midfrontal theta phase coordinates behaviorally relevant brain computations during cognitive control. NeuroImage, 207, 116340. https://doi.org/10.1016/j.neuroimage.2019.116340

      Duque, J., Greenhouse, I., Labruna, L., & Ivry, R. B. (2017). Physiological Markers of Motor Inhibition during Human Behavior. Trends in Neurosciences, 40(4), 219–236. https://doi.org/10.1016/j.tins.2017.02.006

      Engel, A. K., & Fries, P. (2010). Beta-band oscillations—Signalling the status quo? Current Opinion in Neurobiology, 20(2), 156–165. https://doi.org/10.1016/j.conb.2010.02.015

      Heidlmayr, K., Kihlstedt, M., & Isel, F. (2020). A review on the electroencephalography markers of Stroop executive control processes. Brain and Cognition, 146, 105637. https://doi.org/10.1016/j.bandc.2020.105637

      Little, S., Bonaiuto, J., Barnes, G., & Bestmann, S. (2019). Human motor cortical beta bursts relate to movement planning and response errors. PLOS Biology, 17(10), e3000479. https://doi.org/10.1371/journal.pbio.3000479

      Morales, S., & Bowers, M. E. (2022). Time-frequency analysis methods and their application in developmental EEG data. Developmental Cognitive Neuroscience, 54, 101067. https://doi.org/10.1016/j.dcn.2022.101067

      Senoussi, M., Verbeke, P., Desender, K., De Loof, E., Talsma, D., & Verguts, T. (2022). Theta oscillations shift towards optimal frequency for cognitive control. Nature Human Behaviour, 6(7), Article 7. https://doi.org/10.1038/s41562-022-01335-5

      van Veen, V., & Carter, C. S. (2005). Separating semantic conflict and response conflict in the Stroop task: A functional MRI study. NeuroImage, 27(3), 497–504. https://doi.org/10.1016/j.neuroimage.2005.04.042

      Zhao, X., Chen, A., & West, R. (2010). The influence of working memory load on the Simon effect. Psychonomic Bulletin & Review, 17(5), 687–692. https://doi.org/10.3758/PBR.17.5.687

    1. Features that first appeared in NLS

      Features that first appeared in NLS:

      - The mouse
      - 2-dimensional display editing and windows
      - Hypermedia with integrated e-mail
      - Document version control
      - In-file object addressing and linking
      - Outline processing
      - Shared-screen teleconferencing
      - Context-sensitive help
      - Distributed client-server architecture
      - Extensible, user-programmable tools

    1. the Bash shell for Windows, which you can install by following Microsoft's official instructions (in French).

      The link points to a page on "How to install Linux on Windows with WSL". Is that the same thing as the Bash shell?

    1. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      The paper by Chen et al describes the role of neuronal thermo-TRPV3 channels in the firing of cortical neurons at a fever temperature range. The authors began by demonstrating that exposure to infrared light increasing ambient temperature causes body temperature to rise to a fever level above 38°C. Subsequently, they showed that at the fever temperature of 39°C, the spike threshold (ST) increased in both populations (P12-14 and P7-8) of cortical excitatory pyramidal neurons (PNs). However, the spike number only decreased in P7-8 PNs, while it remained stable in P12-14 PNs at 39°C. In addition, the fever temperature also reduced the late peak postsynaptic potential (PSP) in P12-14 PNs. The authors further characterized the firing properties of cortical P12-14 PNs, identifying two types: STAY PNs that retained spiking at 30°C, 36°C, and 39°C, and STOP PNs that stopped spiking upon temperature change. They further extended their analysis and characterization to striatal medium spiny neurons (MSNs) and found that STAY MSNs and PNs shared the same ST temperature sensitivity. Using small molecule tools, they further identified that thermo-TRPV3 currents in cortical PNs increased in response to temperature elevation, but not TRPV4 currents. The authors concluded that during fever, neuronal firing stability is largely maintained by sensory STAY PNs and MSNs that express functional TRPV3 channels. Overall, this study is well designed and executed with substantial controls, some interesting findings, and quality of data. Here are some specific comments:

      (1) Could the authors discuss, or is there any evidence of, changes in TRPV3 expression levels in the brain during the postnatal 1-4 week age range in mice? 

      To our knowledge, no published studies have documented changes in TRPV3 expression levels in the brain during the 1st to 4th postnatal weeks in mice. Research on TRPV3 expression in the mouse brain has primarily involved RT-PCR analysis of RNA from dissociated tissue in adult mice (Jang et al., 2012; Kumar et al., 2018), largely due to the scarcity of effective antibodies for brain tissue sections at the time of publication. Furthermore, the Allen Brain Atlas lacks data on TRPV3 expression in the developing or postnatal brain. To address this gap, we plan to examine TRPV3 expression at P7-8, P12-13, and P20-23 as part of our manuscript revision.

      (2) Are there any differential differences in TRPV3 expression patterns that could explain the different firing properties in response to fever temperature between the STAY- and STOP neurons? 

      This is an excellent question and one we plan to explore in the future by developing reporter mice or viral tools to monitor the activity of cells with endogenous TRPV3 expression. To our knowledge, these tools do not currently exist. Creating them will be challenging, as it requires identifying promoters that accurately reflect endogenous TRPV3 expression. We have not yet quantified TRPV3 expression in STOP and STAY neurons; however, our analysis of evoked spiking activity at 30, 36, and 39°C suggests that TRPV3 expression may mark a population of pyramidal neurons that tend to STAY spiking as temperatures increase. To investigate this further, we are considering patch-seq for TRPV3 expression on recorded neurons. This is a complex experiment, as it requires recording activity at three different temperatures and subsequently collecting the cell contents. While success is not guaranteed, we are committed to attempting these experiments as part of our revisions.

      (3) TRPV3 and TRPV4 can co-assemble to form heterotetrameric channels with distinct functional properties. Do STOP neurons exhibit any firing behaviors that could be attributed to the variable TRPV3/4 assembly ratio? 

      There is some evidence that TRPV3 and TRPV4 proteins can physically associate in HEK293 cells and native skin tissues (Hu et al., 2022).  TRPV3 and TRPV4 are both expressed in the cortex (Kumar et al., 2018), but it remains unclear whether they are co-expressed and co-assembled to form heteromeric channels in cortical excitatory  pyramidal neurons.  Examination of the I-V curve from HEK cells co-expressing TRPV3/4 heteromeric channels shows enhanced current at negative membrane potentials (Hu et al., 2022).  

      Currently, we cannot characterize cells as STOP or STAY and measure TRPV3 or TRPV4 currents simultaneously, as this would require different experimental setups and internal solutions. Additionally, the protocol involves a sequence of recordings at 30, 36, and 39°C, followed by cooling back to 30°C and re-heating to each temperature. Cells undergoing such a protocol will likely not survive until the end.

      In our recordings of TRPV3 currents—which likely include both STOP and STAY cells—we do not observe a significant current at negative voltages, suggesting that TRPV3/4 heteromeric channels may either be absent or underrepresented, at least at a 1:1 ratio. However, the possibility that TRPV3/4 heteromeric channels could define the STOP cell population is intriguing and plausible.

      (4) In Figure 7, have the authors observed an increase of TRPV3 currents in MSNs in response to temperature elevation? 

      We have not recorded TRPV3 currents in MSNs in response to elevated temperatures.

      (5) Is there any evidence of a relationship between TRPV3 expression levels in D2+ MSNs and degeneration of dopamine-producing neurons? 

      This is an interesting question, though it falls outside our current research focus in the lab. A PubMed search yields no results connecting the terms TRPV3, MSNs, and degeneration. However, gain-of-function mutations in TRPV4 channel activity have been implicated in motor neuron degeneration (Sullivan et al., 2024) and axon degeneration (Woolums et al., 2020). Similarly, TRPV1 activation has been linked to developmental axon degeneration (Johnstone et al., 2019), while TRPV3 blockade has shown neuroprotective effects in models of cerebral ischemia/reperfusion injury in mice (Chen et al., 2022).

      The link between TRPV activation and cell degeneration, however, may not be straightforward. For instance, TRPV1 loss has been shown to accelerate stress-induced degradation of axonal transport from retinal ganglion cells to the superior colliculus and to cause degeneration of axons in the optic nerve (Ward et al., 2014). Meanwhile, TRPV1 activation by capsaicin preserves the survival and function of nigrostriatal dopamine neurons in the MPTP mouse model of Parkinson's disease (Chung et al., 2017).

      (6) Does fever range temperature alter the expressions of other neuronal Kv channels known to regulate the firing threshold? 

      This is an active line of investigation in our lab. The results of ongoing experiments will provide further insight into this question.

      Reviewer #2 (Public review): 

      Summary: 

      The authors study the excitability of layer 2/3 pyramidal neurons in response to layer four stimulation at temperatures ranging from 30 to 39 Celsius in P7-8, P12-P14, and P22-P24 animals. They also measure brain temperature and spiking in vivo in response to externally applied heat. Some pyramidal neurons continue to fire action potentials in response to stimulation at 39 C and are called stay neurons. Stay neurons have unique properties aided by TRPV3 channel expression. 

      Strengths: 

      The authors use various techniques and assemble large amounts of data. 

      Weaknesses: 

      (1) No hyperthermia-induced seizures were recorded in the study. 

      The goal of this manuscript is to uncover the age-related physiological changes that enable the brain to retain function at fever temperatures. These changes may potentially explain why most children do not experience febrile seizures or why, in the rare cases when they do occur, the most prominent window of susceptibility is between 2-5 years of age (Shinnar and O’Dell, 2004), as this may coincide with the window during which these developmental changes are normally occurring. While it is possible that impairments in these mechanisms could result in febrile seizures, another possibility is that neural activity may fall below the level required to maintain normal function.

      (2) Febrile seizures in humans are age-specific, extending from 6 months to 6 years. While translating to rodents is challenging, according to published literature (see Baram), rodents aged P11-16 experience seizures upon exposure to hyperthermia. The rationale for publishing data on P7-8 and P22-24 animals, which are outside this age window, must be clearly explained to address a potential weakness in the study. 

      This manuscript focuses on identifying the age-related physiological changes that enable the brain to retain function at fever temperatures. To this end, we examine two age periods flanking the putative window of susceptibility (P12-14), specifically an earlier timepoint (P7-8) and a later timepoint (P20-23). The inclusion of these time points also serves as a negative control, allowing us to determine whether the changes we observe in the proposed window of susceptibility are unique to this period. We believe that including these windows ensures a thorough and objective scientific approach.

      (3) Authors evoked responses from layer 4 and recorded postsynaptic potentials, which then caused action potentials in layer 2/3 neurons in the current clamp. The post-synaptic potentials are exquisitely temperature-sensitive, as the authors demonstrate in Figures 3 B and 7D. Note markedly altered decay of synaptic potentials with rising temperature in these traces. The altered decays will likely change the activation and inactivation of voltage-gated ion channels, adjusting the action potential threshold. 

      In Figure 4B, we surmised that the temperature-induced reductions in inhibition and the subsequent loss of the late PSP primarily contribute to the altered decay of the synaptic potentials.

      (4) The data weakly supports the claim that the E-I balance is unchanged at higher temperatures. Synaptic transmission is exquisitely temperature-sensitive due to the many proteins and enzymes involved. A comprehensive analysis of spontaneous synaptic current amplitude, decay, and frequency is crucial to fully understand the effects of temperature on synaptic transmission. 

      Thank you for the opportunity to provide clarification. It was not stated, nor did we intend to imply, that in general, E-I balance is unchanged at higher temperatures. Please see the excerpt from the manuscript below. The statements specifically referred to observations made for experiments conducted during the P20-26 age range for cortical pyramidal neurons. We have a parallel line of investigation exploring the differential susceptibility of E-I balance based on age and temperature. Additionally, our measurements focus on evoked activity, rather than spontaneous activity, as these events are more likely linked to the physiological changes underlying behavior in the sensory cortex.

      “As both excitatory and inhibitory PNs that stay spiking increase their firing rates (Figure 5B) and considering that some neurons within the network are inactive throughout or stop spiking, it is plausible that these events are calibrated such that despite temperature increases, the excitatory to inhibitory (E-I) balance within the circuit may remain relatively unchanged. Indeed, recordings of L4-evoked excitatory and inhibitory postsynaptic currents (respectively EPSCs and IPSCs) in wildtype L2/3 excitatory PNs in S1 cortex, where inhibition is largely mediated by the parvalbumin positive (PV) interneurons, showed that E-I balance (defined as E/E+I, the ratio of the excitatory current to the total current) remained unchanged as temperature increased from 36 to 39°C (Figure 5E).”

      (5) It is unclear how the temperature sensitivity of medium spiny neurons is relevant to febrile seizures. Furthermore, the most relevant neurons are hippocampal neurons since the best evidence from human and rodent studies is that febrile seizures involve the hippocampus. 

      Thank you for the opportunity to clarify. Our goal was not to establish a link between medium spiny neuron (MSN) function and febrile seizures. The manuscript's focus is on identifying age-related physiological changes that enable supragranular cortical cells in the brain to retain function at fever temperatures. MSNs were selected for mechanistic comparison in this study because they represent a non-pyramidal, non-excitatory neuronal subtype, allowing us to assess whether the physiological changes observed in L2/3 excitatory pyramidal neurons are unique to these cells.

      (6) TRP3V3 data would be convincing if the knockout animals did not have febrile seizures. 

      Could you kindly provide the reference indicating that TRPV3 KO mice have seizures? Unfortunately, we were unable to locate this reference. It is important to distinguish febrile seizures, which occur within the range of physiological body temperatures (~38 to 40°C), from seizures resulting from heat stroke, a severe form of hyperthermia occurring when body temperature exceeds 40.0°C. Mechanistically, these may represent different phenomena, as the latter is typically associated with widespread protein denaturation and cell death, whereas febrile seizures are usually non-lethal. Additionally, TRPV3 is located on chromosome 17p13.2, a region not currently associated with seizure susceptibility.

      Reviewer #3 (Public review): 

      Summary: 

      This important study combines in vitro and in vivo recording to determine how the firing of cortical and striatal neurons changes during a fever range temperature rise (37-40°C). The authors found that certain neurons will start, stop, or maintain firing during these body temperature changes. The authors further suggested that the TRPV3 channel plays a role in maintaining cortical activity during fever.

      Strengths: 

      The topic of how the firing pattern of neurons changes during fever is unique and interesting. The authors carefully used in vitro electrophysiology assays to study this interesting topic. 

      Weaknesses: 

      (1) In vivo recording is a strength of this study. However, data from in vivo recording is only shown in Figures 5A,B. This reviewer suggests the authors further expand on the analysis of the in vivo Neuropixels recording. For example, to show single spike waveforms and raster plots to provide more information on the recording. The authors can also separate the recording based on brain regions (cortex vs striatum) using the depth of the probe as a landmark to study the specific firing of cortical neurons and striatal neurons. It is also possible to use published parameters to separate the recording based on spike waveform to identify regular principal neurons vs fast-spiking interneurons. Since the authors studied E/I balance in brain slices, it would be very interesting to see whether the "E/I balance" based on the firing of excitatory neurons vs fast-spiking interneurons might be changed or not in the in vivo condition. 

      As requested, in the revised manuscript, we will include examples of single spike waveforms and raster plots for the in vivo recordings. Please note that all recordings were conducted in the cortex, not the striatum. To clarify, we used published parameters to separate the recordings based on spike waveform, which allowed us to identify regular principal neurons and fast-spiking interneurons. The paragraph below from the methods section describes this procedure.

      “Following manual curation, based on their spike waveform duration, the selected single units (n = 633) were separated into putative inhibitory interneurons and excitatory principal cells (Barthó et al., 2004). The spike duration was calculated as the time difference between the trough and the subsequent waveform peak of the mean filtered (300 – 6000 Hz bandpassed) spike waveform. Durations of extracellularly recorded spikes showed a bimodal distribution (Hartigan’s dip test; p < 0.001) characteristic of the neocortex with shorter durations corresponding to putative interneurons (narrow spikes) and longer durations to putative principal cells (wide spikes). Next, k-means clustering was used to separate the single units into these two groups, which resulted in 140 interneurons (spike duration < 0.6 ms) and 493 principal cells (spike duration > 0.6 ms), corresponding to a typical 22% - 78% (interneuron – principal) cell ratio”.
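The waveform-based separation in the quoted methods can be sketched in a few lines of Python. This is only a minimal illustration of the idea (trough-to-peak duration, then two-cluster k-means on the durations); it hand-rolls a 1-D two-means and omits the filtering and Hartigan's dip test used in the actual pipeline, and the example sampling rate and waveforms are made up.

```python
def spike_duration_ms(waveform, fs_hz):
    """Trough-to-subsequent-peak duration of a mean spike waveform, in ms."""
    trough = min(range(len(waveform)), key=lambda i: waveform[i])
    peak = trough + max(range(len(waveform) - trough),
                        key=lambda j: waveform[trough + j])
    return (peak - trough) / fs_hz * 1000.0

def split_units(durations_ms, iters=25):
    """Hand-rolled 1-D two-means over spike durations.

    The cluster with the shorter mean duration is labeled 'interneuron'
    (narrow spikes), the other 'principal' (wide spikes).
    """
    c_lo, c_hi = min(durations_ms), max(durations_ms)
    for _ in range(iters):
        groups = [[], []]
        for d in durations_ms:
            groups[0 if abs(d - c_lo) <= abs(d - c_hi) else 1].append(d)
        if groups[0]:
            c_lo = sum(groups[0]) / len(groups[0])
        if groups[1]:
            c_hi = sum(groups[1]) / len(groups[1])
    return ["interneuron" if abs(d - c_lo) <= abs(d - c_hi) else "principal"
            for d in durations_ms]
```

On a bimodal set of durations this recovers a boundary between the two modes without fixing 0.6 ms in advance; the threshold reported in the methods is the one that emerged from the recorded data.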

      In vivo patching to record extracellular and inhibitory responses at 36°C and then waiting 10 minutes to record again at 39°C would be an extremely challenging experiment. Due to the high difficulty and expected very low yield, these experiments will not be pursued for the revision studies.

      (2) The author should propose a potential mechanism for how TRPV3 helps to maintain cortical activity during fever. Would calcium influx-mediated change of membrane potential be the possible reason? Making a summary figure to put all the findings into perspective and propose a possible mechanism would also be appreciated. 

      Thank you for your helpful suggestions. In response to your recommendation, we will include a summary figure detailing the hypothesis currently described in the discussion section of the manuscript. The excerpt from the discussion is included below.

      “Although, TRPV3 channels are cation-nonselective, they exhibit high permeability to Ca2+ (Ca²⁺ > Na⁺ ≈ K⁺ ≈ Cs⁺) with permeability ratios (relative to Na+) of 12.1, 0.9, 0.9, 0.9 (Xu et al., 2002). Opening of TRPV3 channels activates a nonselective cationic conductance and elevates membrane depolarization, which can increase the likelihood of generating action potentials. Indeed, our observations of a loss of the temperature-induced increases in the PSP with TRPV3 blockade are consistent with a reduction in membrane depolarization. In S1 cortical circuits at P12-14, STAY PNs appear to rely on a temperature-dependent activity mechanism, where depolarization levels (mediated by higher excitatory input and lower inhibitory input) are scaled to match the cell’s ST. Thus, an inability to increase PSPs with temperature elevations prevents PNs from reaching ST, so they cease spiking.”

      (3) The author studied P7-8, P12-14, and P20-26 mice. How do these ages correspond to the human ages? it would be nice to provide a comparison to help the reader understand the context better.

      Ideally, the mouse-human age comparison would depend on the specific process being studied. Please note that these periods are described in the introduction of the manuscript. The relevant excerpt is included below. Let us know if you need any additional modifications to this description.

      “Using wildtype mice across three postnatal developmental periods—postnatal day (P)7-8 (neonatal/early), P12-14 (infancy/mid), and P20-26 (juvenile/late)—we investigated the electrophysiological properties, ex vivo and in vivo, that enable excitatory pyramidal neurons (PNs) neurons in mouse primary somatosensory (S1) cortex to remain active during temperature increases from 30°C (standard in electrophysiology studies) to 36°C (physiological temperature), and then to 39°C (fever-range).”

    1. Reviewer #2 (Public review):

      Summary:

      In this work, the investigators isolated one Lacticaseibacillus rhamnosus strain (P118), and determined this strain worked well against Salmonella Typhimurium infection. Then, further studies were performed to identify the mechanism of bacterial resistance, and a list of confirmatory assays was carried out to test the hypothesis.

      Strengths:

      The authors provided details regarding all assays performed in this work, and this reviewer trusted that the conclusion in this manuscript is solid. I appreciate the efforts of the authors to perform different types of in vivo and in vitro studies to confirm the hypothesis.

      Weaknesses:

      I have two main questions about this work.

      (1) The authors provided the below information about the sources from which Lacticaseibacillus rhamnosus was isolated. More details are needed. What are the criteria to choose these samples? Where did these samples originate from? How many strains of bacteria were obtained from which types of samples?

      Lines 486-488: Lactic acid bacteria (LAB) and Enterococcus strains were isolated from the fermented yoghurts collected from families in multiple cities of China and the intestinal contents from healthy piglets without pathogen infection and diarrhoea by our lab.

Lines 129-133: A total of 290 bacterial strains were isolated and identified from 32 samples of the fermented yoghurt and piglet rectal contents collected across diverse regions within China using MRS and BHI medium, which consists of 63 Streptococcus strains, 158 Lactobacillus/Lacticaseibacillus/Limosilactobacillus strains, and 69 Enterococcus strains.

      (2) As a probiotic, Lacticaseibacillus rhamnosus has been widely studied. In fact, there are many commercially available products, and Lacticaseibacillus rhamnosus is the main bacteria in these products. There are also ATCC type strains such as 53103.

      I am sure the authors are also interested to know whether P118 is better as a probiotic candidate than other commercially available strains. Also, would the mechanism described for P118 apply to other Lacticaseibacillus rhamnosus strains?

      It would be ideal if the authors could include one or two Lacticaseibacillus rhamnosus which are currently commercially used, or from the ATCC. Then, the authors can compare the efficacy and antibacterial mechanisms of their P118 with other strains. This would open the windows for future work.

    1. Last Modified on 10/26/2023 3:41 pm EDT stages® stores Alarm History on all sites. In the

There's some weird hyperlinking error happening in this highlighted text. Looking at the source code in Chrome's developer tools, I can't figure out how or why this link, which (at least for me) summons the Windows print preview dialog, is being applied so frequently.

the place where you are

I can't change the starting location. Depending on which command prompt I open, I'm in Users\Myname> or WINDOWS\system32.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

Comment 1. In Figure 1, the MafB antibody (Sigma) was used to identify Renshaw cells at P5. However, according to the supplementary Figure 3D, the specificity of the MafB antibody (Sigma) is relatively low. The image of MafB-GFP, V1-INs, and MafB-IR at P5 should be added to the supplementary figure. The specificity of MafB-IR (Sigma) in V1 neurons at P5 should be shown. This image also might support the description of the genetically labeled MafB-V1 distribution at P5 (page 8, lines 28-32).

      We followed the reviewer’s suggestion and moved analyses of the MafB-GFP mouse to a supplemental figure (Fig S3). The characterization of MafB immunoreactivities is now in supplemental Figure S2 and the related text in results was also moved to supplemental to reduce technicalities in the main text. We added confocal images of MafB-GFP V1 interneurons at P5 showing immunoreactivities for both MafB antibodies, as suggested by the reviewer (Fig S2A,B). We agree with the reviewer that this strengthens our comparisons on the sensitivity and specificity of the two MafB antibodies used in this study. 

      As explained in the preliminary response we cannot show lack of immunoreactivity for MafB antibodies in MafB GFP/GFP knockout mice at P5 because MafB global KOs die at birth. This is why we used tissues from late embryos to check MafB immunoreactivities (Figure S2C and S2D). We made this point clearer in the text and supplemental figure legends.

      Comment 2. The proportion of genetically labeled FoxP2-V1 in all V1 is more than 60%, although immunolabeled FoxP2-V1 is approximately 30% at P5. Genetically labeled Otp-V1 included other nonFoxP2 V1 clades (Fig. 8L-M). I wonder whether genetically labeled FoxP2-V1 might include the other three clades. The authors should show whether genetically labeled FoxP2-V1 expresses other clade markers, such as pou6f2, sp8, and calbindin, at P5. 

      We included the requested data in Figure 3E-G. Lineage-labeled Foxp2-V1 neurons in our genetic intersection do not include cells from other V1-clades.

      Reviewer 2:

      Comment 1. The current version of the paper is VERY hard to read. It is often extremely difficult to "see the forest for the trees" and the reader is often drowned in methodological details that provide only minor additions to the scientific message. Non-specialists in developmental biology, but still interested in the spinal cord organization, especially students, might find this article challenging to digest and there is a high risk that they will be inclined to abandon reading it. The diversity of developmental stages studied (with possible mistakes between text and figures) adds a substantial complexity in the reading. It is also not clear at all why authors choose to focus on the Foxp2 V1 from page 9. Naively, the Pou6f2 might have been equally interesting. Finally, numerous discrepancies in the referencing of figures must also be fixed. I strongly recommend an in-depth streamlining and proofreading, and possibly moving some material to supplement (e.g. page 8, and elsewhere).

      The whole text was re-written and streamlined with most methodological discussion (including the section referred to by the reviewer) transferred to supplemental data. Nevertheless, enough details on samples, stats and methods were retained to maintain the rigor of the manuscript. 

      The reasons justifying a focus on Foxp2-V1 interneurons were fully explained in our preliminary response. Briefly, we are trying to elucidate V1 heterogeneity, and prior data showed that this is the most heterogeneous V1 clade (Bikoff et al., 2016), so it makes sense it was studied further. We agree that the Pou6f2 clade is equally interesting and is in fact the subject of several ongoing studies.

Comment 2. … although the different V1 populations have been investigated in detail regarding their development and positioning, their functional ambition is not directly investigated through gain or loss of function experiments. For the Foxp2-V1, the developmental and anatomical mapping is complemented by a connectivity mapping (Figs 6, 8), but the latter is fairly superficial compared to the former. Synapses (Fig 6) are counted on a relatively small number of motoneurons per animal, that may, or may not, be representative of the population. Likewise, putative synaptic inputs are only counted on neuronal somata. Motoneurons that lack axo-somatic contacts may still be contacted distally. Hence, while this data is still suggestive of differences between V1 pools, it is only weakly predictive of function.

      We fully answered the question on functional studies in the preliminary response. Briefly, we are currently conducting these studies using various mouse models that include chronic synaptic silencing using tetanus toxin, acute partial silencing using DREADDs, and acute cell deletion using diphtheria toxin. Each intervention reveals different features of Foxp2-V1 interneuron functions, and each model requires independent validation. Moreover, these studies are being carried out at three developmental stages: embryos, early postnatal period of locomotor maturation and mature animals. Obviously, this is all beyond the goals and scope of the present study. The present study is however the basis for better informed interpretations of results obtained in functional studies.

Regarding the question on synapse counts, we explained fully in the preliminary response why we believe our experimental designs for synapse counting at the confocal level are among the most thorough that can be found in the literature. We counted a very large number of motoneurons per animal when adding all motor columns and segments analyzed in each animal. Statistical power was also sufficient to detect fundamental variation in synaptic density among motor columns.

We focus our analyses on motoneuron cell bodies because analysis of full dendritic arbors on all motor columns present throughout all lumbosacral segments is not feasible. Please see Rotterman et al., 2014 (J. of Neuroscience; doi: 10.1523/JNEUROSCI.4768-13.2014) for evaluation of what this entails for a single motoneuron. We agree with the reviewer that analyses of V1 synapses over full dendrite arbors in specific motoneurons will be very relevant in further studies. These should be carried out now that we know which motor columns are of high interest. Nevertheless, inhibitory synapses exert the most efficient modulation of neuronal firing when they are on cell bodies, and our analyses clearly suggest a difference in cell body inhibitory synapse targeting between different V1 interneuron types that we find very relevant.

      Comment 3. I suggest taking with caution the rabies labelling (Figure 8). It is known that this type of Rabies vectors, when delivered from the periphery, might also label sensory afferents and their postsynaptic targets in the cord through anterograde transport and transneuronal spread (e.g., Pimpinella et al., 2022). Yet I am not sure authors have made all controls to exclude that labelled neurons, presumed here to be premotoneurons, could rather be anterogradely labelled from sensory afferents. 

      Over the years, we performed many extensive controls and validation of rabies virus transsynaptic tracing methods. These were presented at two SfN meetings (Gomez-Perez et al., 2015 and 2016; Program Nos. 242.08 and 366.06). Our validation of this technique was fully explained in our preliminary response. We also pointed out that the methods used by Pimpinella et al. have a very different design and therefore their results are not comparable to ours. In this study we injected the virus at P15 into leg muscles, and not directly into the spinal cord. In our hands, and as cited in Pimpinella et al., the rabies virus loses tropism for primary afferents with age when injected in muscle. The lack of primary afferent labeling in key lumbosacral segments (L4 and L5) is now illustrated in a new supplemental figure (Figure S6). This figure also shows some starter motoneurons. As explained in the text and in our previous response, these are few in number because of the reduced infection rate when using this method in mature animals (after P10).  

      Comment 4. The ambition to differentiate neuronal birthdate at a half-day resolution (e.g., E10 vs E10.5) is interesting but must be considered with caution. As the author explains in their methods, animals are caged at 7pm, and the plug is checked the next morning at 7 am. There is hence a potential error of 12h. 

We agree with the reviewer, and we previously explicitly discussed these temporal resolution caveats. We have now further expanded on this in new text (see middle paragraph on page 5). Nevertheless, the method did reveal the temporal sequence of neurogenesis of V1 clades with close to 12-hour resolution.

      As explained in text and preliminary response this is because we analyzed a sufficient number of animals from enough litters and utilized very stringent criteria to count EdU positives. 

Moreover, our results fit very well with the current literature. The data agree with previous conclusions from Andreas Sagner’s group (Institut für Biochemie, Friedrich-Alexander-Universität Erlangen-Nürnberg) on spinal interneuron (including V1) birthdates based on a different methodology (Delile J et al. Development. 2019 146(12):dev173807. doi: 10.1242/dev.173807. PMID: 30846445; PMCID: PMC6602353). In the discussion we compared in detail both the data and methods between the Delile article and our results. We also cite the Sagner 2024 review as requested later in the reviewer’s detailed comments. Our results also confirmed our previous report on the birthdates of V1-derived Renshaw cells and Ia inhibitory interneurons (Benito-Gonzalez A, Alvarez FJ. J Neurosci. 2012 32(4):1156-70. doi: 10.1523/JNEUROSCI.3630-12.2012. PMID: 22279202; PMCID: PMC3276112). Finally, we recently received a communication notifying us that our neurogenesis sequence of V1s has been replicated in a different vertebrate species by Lora Sweeney’s group (Institute of Science and Technology Austria; direct email from this lab), and we shared our data with them for comparison. That manuscript is currently close to submission. Therefore, we are confident that, despite the limitations of EdU birthdating we discussed, the conclusions we offered are strong and are being validated by other groups using different methods and species. We also want to acknowledge the positive comments of reviewer 3 regarding our birthdating study, indicating it is one of the most rigorous he or she has ever seen.

      Reviewer 3:

      Comment 1. My only criticism is that some of the main messages of the paper are buried in technical details. Better separation of the main conclusions of the paper, which should be kept in the main figures and text, and technical details/experimental nuances, which are essential but should be moved to the supplement, is critical. This will also correct the other issue with the text at present, which is that it is too long.

Similar to our response to comment 1 from Reviewer 2, we followed the reviewers’ recommendations and greatly summarized, simplified, and removed technical details from the main text, trying not to decrease rigor.

      Reviewer #1 (Recommendations For The Authors):

      In Figure 1, the definition of the area to analyze MafB ventral and MafB dorsal is unclear. It should be described.

      This has been clarified in both text and supplemental figure S3.

      “We focused the analyses on the brighter dorsal and ventral MafB-V1 populations defined by boxes of 100 µm dorsoventral width at the level of the central canal (dorsal) or the ventral edge of the gray matter (ventral) (Supplemental Figure S3B).”

      Problems with figure citation.

      We apologize for the mistakes. All have been corrected. 

      Reviewer #2 (Recommendations For The Authors):

      As indicated in the public review, I'd recommend to substantially revise the writing, for clarity. As such, the paper is extremely hard to read. I would also recommend justifying the focus on Foxp2 neurons.

      Also, the scope of the present paper is not clearly stated in the introduction (page 4).

      Done. We also modified the introduction such that the exact goals are more clearly stated.

      I would also recommend toning down the interpretation that V1 clades constitute "unique functional subsets" (discussion and elsewhere). Functional investigation is not performed, and connectomic data is partial and only very suggestive.

      We include the following sentence at the end of the 1st paragraph in the discussion:

      “This result strengthens the conclusion that these V1 clades defined by their genetic make-up might represent distinct functional subtypes, although further validation is necessary in more functionally focused studies.”

      Different post-natal stages are used for different sections of the manuscript. This is often confusing, please justify each stage. From the beginning even, why is the initial birthdating (Figure 1) done here at p5, while the previous characterization of clades was done at p0? I am not sure to understand the justification that this was chosen "to preserve expression of V1 defining TFs". Isn't the sooner the better?

The birthdating study was carried out at P5. P5 is a good time point because there is little variation in TF expression compared to P0, as demonstrated in the results. Furthermore, later tissue harvesting allows higher replicability since it is difficult to consistently harvest tissue the day a litter is born (P0). Also, it is technically easier to handle P5 tissue than P0 tissue. The analysis of VGLUT1 synapses was also done at P5 rather than at later ages. This has two advantages: TF immunoreactivities are preserved at this age, and corticospinal projections have not yet reached the lumbar cord, reducing interpretation caveats on the origins of VGLUT1 synapses in the ventral horn (although VGLUT1 synapses are still maturing at this age, see below).

      Other parts of the study focus on different ages selected to be most adequate for each purpose. To best study synaptic connectivity, it is best to study mature spinal cords after synaptic plasticity of the first week. For the tracing study we thoroughly explain in the text the reasons for the experimental design (see also below in detailed comments). For counting Foxp2-V1 interneurons and comparing them to motor columns we analyze mature animals. For testing our lineage labeling we use animals of all ages to confirm the consistency of the genetic targeting strategy throughout postnatal development and into adulthood.

Figure 5: wouldn't it be worth quantifying and illustrating cellular densities, in addition to the average number of Foxp2 neurons, across lumbar segments (panels D & E)? Indeed, the size of - and hence total number of cells within - each lumbar segment might not be the same, with a significant "enlargement" from L2 to L4 (this is actually visible on the transverse sections). Hence, if the total number of cells is higher in these enlarged segments, but the total number of Foxp2-V1 is not, it may mean that this class is proportionally less abundant.

      We believe the critical parameter is the ratio of Foxp2-V1s to motoneurons. This informs how Foxp2-V1 interneurons vary according to the size of the motor columns and the number of motoneurons overall.

      The question asked by the reviewer would best be answered by estimating the proportion of Foxp2-V1 neurons to all NeuN labeled interneurons. This is because interneuron density in the spinal cord varies in different segments. We are not sure what this additional analysis will contribute to the paper.

      Why, in the Rabies tracing scheme (Fig 8), the Rabies injection is performed at p15? As the authors explain in the text, rabies uptake at the neuromuscular junction is weak after p10. It is not clear to me why such experiments weren't done all at early postnatal stages, with a "classical" co-injection of TVA and Rabies.

      First, we do not need TVA in this experiment because we are using B19-G coated virus and injecting it into muscles, not into the spinal cord directly.

Second, enhanced tracing occurs when the AAV is injected a few days before the rabies virus. This is because AAV transgene expression is delayed with respect to rabies virus infection and replication. We have performed full time courses and presented these data in one abstract at SfN (Gomez-Perez et al., 2015, Program No. 242.08). We believe a full description of these technical details is beyond the scope of this manuscript, which has already been considered too technical.

      Third, the justification of P15 timing of injections for anterograde primary afferent labeling and retrograde monosynaptic labeling of interneurons is fully explained in the text. 

      “To obtain transcomplementation of RVDG-mCherry with glycoprotein in LG motoneurons, we first injected the LG muscle with an AAV1 expressing B19-G at P4. We then performed RVDG and CTB injections at P15 to optimize muscle targeting and avoid cross-contamination of nearby muscles. Muscle specificity was confirmed post-hoc by dissection of all muscles below the knee. Analyses were done at P22, a timepoint after developmental critical windows through which Ia (VGLUT1+) synaptic numbers increase and mature on V1-IaINs (Siembab et al., 2010)” 

Furthermore, CTB starts to decrease in intensity 7 days after injection because of intracellular degradation, and rabies virus labeling disappears because of cell death. Both limit the post-injection time available for analyses.

Likewise, I am surprised not to see a single motoneuron in the rabies tracing (Fig 8), neither in the histology nor in the graphs. How can the authors be certain that there was indeed rabies uptake from the muscle at this age, and that all labelled cells, presumed to be preMN, are not actually sensory neurons? It is known that Rabies vectors, when delivered from the periphery, might also label sensory afferents and their post-synaptic targets through anterograde transport and transneuronal spread (e.g., Pimpinella et al., 2022). This potential bias must be considered.

      This is fully explained in our previous response to the second reviewer’s general comments. We have also added a confocal image showing starter motoneurons as requested (Figure S6A).

      Please carefully inspect the references to figures and figure panels, which I suspect are not always correct.

      Thank you. We carefully revised the manuscript to correct these deficiencies and we apologize for them.

      Reviewer #3 (Recommendations For The Authors):

Figure 1: Data here is absolutely beautiful and provides one of the most thorough studies, in terms of timepoints, number of animals analyzed, and precision of analysis, of EdU-based birth timing that has been published for neuron subtypes in the spinal cord so far. My only suggestion is to color code the early and late born populations (for example, in different shades of green for early and blue for late) to better emphasize the differences between them. It is very difficult to differentiate between the purple, red and black colors in G-I, which this would also fix. The antibody staining for Pou6f2 (F) is also difficult to see; gain could be increased on these images or insets added for clarity.

      The choice of colors is adapted for optimal visualization by people with different degrees of color blindness. Shades of individual colors are always more difficult to discriminate. This is personally verified by the senior corresponding author of this paper who has some color discrimination deficits. Moreover, each line has a different symbol for the same purpose of easing differentiation.

Figure 2: This is also a picture-perfect figure showing further diversity by birth time even within a clade. One small aesthetic comment is that the arrows are quite unclear and block the data. Perhaps the contours themselves could be subdivided by region and color coded by birth time, such that, for example, the dorsal contours that emerge in the MafB clade at E11 are highlighted in their own color. Some quantification of the shift in distribution as well as the relative number of neurons within each spatially localized group would also be useful. For MafB, for example, it looks as though the ventral cells (likely Renshaw) are generated at all times in the contour plots; in the dot plots, however, it looks like the most ventral cells are present at e10.5. This is likely because the contours are measuring fractional representations, not absolute number. An independent measure of absolute number of ventral and dorsal, by for example, subdividing the spinal cord into dorsoventral bins, would be very useful to address this ambiguity.

We believe density plots already convey the message of the shift in positions with birthdate. We are not sure how we can quantify this more accurately than showing the differences in cellular density plots. We used dorsoventral and mediolateral binning in our first paper decades ago (Alvarez et al., 2005). This has now been replaced by more rigorous density profiles that better describe cell distributions. Unfortunately, to obtain the most accurate density profiles we need to pool all cells from all animals, precluding statistical comparisons. This is because some groups have very few cells per animal (for example, early-born Sp8 or Foxp2 cells).

Figure 3 and Figure 4: These, and all figures that compare the lineage trace and antibody staining, should be moved to the supplement in my opinion, as they are not for generalist readers but rather specialists that are interested in these exact tools. In addition, the majority of the text that relates to these figures should be transferred to the supplement as well.

Figure 5: Another great figure that sets the stage for the analysis of FoxP2V1-to-MN synaptic connectivity, and provides basic information about the rostrocaudal distribution of this clade, by analyzing settling position by level. I have only minor comments. The grid in B obscures the view of the cells and should be removed. The motor neuron cell bodies in C would be better visible if they were red.

      We moved some of the images to supplemental (see new supplemental Fig S4). However, we also added new data to the figure as requested by reviewers (Fig 3E-G). We preserved our analyses of Foxp2 and non-Foxp2 V1s across ages and spinal segments because we think this information is critical to the paper. Finally, we want to prevent misleading readers into believing that Foxp2 is a marker that is unique to V1s. Therefore, we also preserved Figures 3H to 3J showing the non-V1 Foxp2 population in the ventral horn. 

Figure 6: Very careful and quantitative analysis of V1 synaptic input to motor neurons is presented here. For the reader, a summary figure (similar to B but with V1s too) that schematizes V1 FoxP2 versus Renshaw cell connectivity with LMC, MMC, and PGC motor neurons at one level would be useful.

      Thanks for the suggestion. A summary figure has now been included (Figure 5G). 

      Figure 7: The goal of this figure is to highlight intra-clade diversity at the level of transcription factor expression (or maintenance of expression), birth timing and cell body position culminating in the clear and concise diagram presented in G. In panels A-F however, it takes extra effort to link the data shown to these I-IV subtypes. The figure should be restructured to better highlight these links. One option might be to separate the figure into four parts (one for each type): with the individual spatial, birth timing and TF data for each population extracted and presented in each individual part.

      We agree with the reviewer that this is a very busy figure. We tried to re-structure the figure following the suggestions of the reviewer and also several alternative options. All resulted in designs that were more difficult to follow than the original figure. We apologize for its complexity, but we believe this is the best organization to describe all the data in the simplest form.

Figure 8: In A-D, the main point of the figure - that V1FoxP2Otp cells preferentially receive proprioceptive synapses - is buried in a bunch of technical details. To make it easier for the reader, please:

      (1) add a summary as in B of the %FoxP2-V1 Otp+ cells (82%) with Vglut1 synapses to make the point stronger that the majority of these cells have synapses.

      We added this graph by extending the previous graph to include lineage labeled Foxp2-V1s with OTP or Foxp2 immunoreactivity. It is now Figure 7B.

      (2) Additionally, add a representative example that shows large numbers of proximal synapses on an FoxP2-V1 Otp+.

      The image we presented before as Figure 8A was already immunostained for OTP, so we just added the OTP channel to the images. Now all this information is in panels that are subparts of Figure 7A.

      (3) Move the comparison between FoxP2-V1 and FoxP2AB+V1s to the supplement.

      We preserved the quantitative data on Foxp2-V1 lineage cells with Foxp2-immunoreactivity but made this a standalone figure, so it is not as busy.

      (4) Move J-M description of antibody versus lineage trace of Otp to supplement as ending with this confuses the main message of the paper (see comment above).

All results for the Otp-V1 mouse model have now been placed in a supplemental figure (Figure S5).

      Discussion: A more nuanced and detailed discussion of how the temporal pattern of subtype generation presented here aligns with the established temporal transcription factor code (nicely summarized in Sagner 2024) would be helpful to place their work in the broader context of the field.

This aspect of the discussion was expanded on pages 20 and 21. We replaced the earlier cited review (Sagner and Briscoe, 2019, Development) with the updated Sagner 2024 review and further discussed the data in the context of the field and neurogenesis waves throughout the neural tube, not only the spinal cord. We previously carefully compared our data with the spinal cord data from Sagner’s group (Delile et al., 2019, Development). We have now further expanded this comparison in the discussion.

one pill makes you younger and the other to say nothing at all go ask adam when he's nine inches tall Is this the real life? Is this just fantasy? Caught in a landslide, no escape from reality Open your eyes, look up to the skies and see I'm just a poor boy, I need your sympathy Because it's easy come, easy go, little high, little low And the way the wind blows really matters to me, to me So when you look up at the sky, eyes open; and you see a bright red planet, connecting the "d" of Go-d to Medusa and "medicine" I surely wonder if you think it by chance that "I wipe my brow and I sweat my rust" as I wake up to action dust... and wonder aloud how obvious it is that the Iron Rod of Christ and the stories of Phillip K. Dick all congeal around not just seeing but reacting to the fact that we clearly have an outlined narrative of celestial bodies and the past acts of angels and how to move forward without selling air or water or food to the short of breath and the thirsty and those with a hunger to seek out new opportunities?  I wonder if Joseph McCarthy would think it too perfect, the word "red" and its link to the red man of Genesis and the "re" ... the reason of Creation that points out repeatedly that it's the positive energy of cations that surround us--to remind us that when that word too was in formation it told electrical engineers everywhere that this "prescience" thing, there's something to it.  Precious of you to notice... but because your science is so sure--you too seem to imagine there's some other explanation for that word, too.  Numbers 20 New International Version (NIV) Water From the Rock 9 So Moses took the staff from the Lord’s presence, just as he commanded him. 10 He and Aaron gathered the assembly together in front of the rock and Moses said to them, “Listen, you rebels, must we bring you water out of this rock?” 11 Then Moses raised his arm and struck the rock twice with his staff. 
Water gushed out, and the community and their livestock drank. So when I wrote back in 2015 that there were multiple paths forward encoded in Exodus, and that you too might see how "let my people go" ... to Heaven ... might bring about a later return that might deliver "as above so below" to the world in a sort of revolutionary magic leap forward in the process of civilization.  Barring John Stewart and the "sewer" that I think you can probably see is actually encoded in the Brothers Grimm and maybe some Poe--it might not be so strange to wonder if the place that we've come from maybe isn't exactly as bright and cheery and "filled with light" as the Zohar and your dreams might have us all believe ... on "faith" that what we see here might just be the illusion of darkness--a joke or a game.  This thing is what's not a game--I've looked at the message that we've written and to me it seems that we are the light, that here plain as day and etched in something more concrete than chalk is a testament to freedom and to incremental improvement... all the way up until we run against this very wall; and then you too seem to crumble.  Still I'm sure this message is here with us because it's our baseline morality and our sense of right from wrong that is here as a sort of litmus test for the future--perhaps to see if they've strayed too far from the place where they came, or if they've given just one too many ounces of innocence to look forward with the same bright gaze of hope that we see in the eyes of our children. fearing the heart of de roar searing the start of lenore I saw this thing many years ago, and I've written about it before, though I hasten to explain that the thing that I once saw as a short-cut or a magic warp pipe in Super Mario Brothers today seems much more like a test than a game and more like a game than a cheat code; so I've changed over the course of watching what's happened on the ground here and I can only imagine how long it's been in the sky.  
In my mind I'm thinking about mentioning the rather pervasive sets of "citizenship suffixes" that circle the globe--ones I've talked about, "ICA" and "IAN"--and how these suffixes might link together with some other concepts that run deep in the story that begins in Ur and pauses here.  For everyone on the "Yo N" that again shows the import of medicine and Medusa in the "rising" of stars, balls of fiery fusion, to people that see and act on the difference between Seyfried and "say freed."

Even before that I knew how important it was that we were sitting here on a "rock in space" with no contact from anyone or anything outside of our little sphere ... how scary it was that all the life we knew of was stuck orbiting a single star in a single galaxy, and it imbued a sort of moral mandate to escape--to ensure that this miracle of random chance and guiding negentropy of time ... that it wasn't forever lost by something like a collision with the comet ISON or even another galaxy.  On that word too--we see the "an" of Christianity messianically appear to become more useful (that's negative energy, by the way) in the chemistry of Mr. Schwarzenegger's magical hand in delivering "free air" (that's free, as in beer; or maybe absinthe) to the people of our great land... anyway, I saw "anions" and a planet oddly full of a perfect source of oxygen, and I thought to myself: it would be so easy to genetically engineer some kind of yeast or mold (like they're doing to make real artificial beef, today) to eat up the rust and turn it into breathable air; and I dreamt up a way to throw an extra "r" into potable and maybe beam some of our water or hydrogen over to the red planet and turn it blue again.

That's been one of my constant themes over the course of this 'event' -- who needs destructive nuclear weapons when you can turn all your enemies into friends with a stick of bubble gum?
That's another one of our little story points too--I see plenty of people walking around in this virtual reality covering their mouths and noses with breathing masks... of course the same Targeted Individuals that know with all their heart that mind control is responsible for the insane pattern of school shootings and the Hamas Hand of the Middle East--they'll tell you those chemtrails you see are the cause; and while I know better, and you do too... maybe these people think they know something about the future, maybe those chemtrails are there because someone actually plans on dispersing some friendly bubble gum into the air... and maybe these people "think they know."  Of course I think this "hand" you see just below is one and the same with the "ID5" logo that I chose to mark my "chalk" and only later saw matched fairly perfectly to John Connor's version of "I'll be back" ... and of course I think you're reading the thing that actually delivers some "breathe easy" to the world; but it's really important to see that today it's not just Total Recall and Skynet and these words that are the proverbial effect of the hand, but also things like Nestle ... to remind you that we're still gazing at a world that would sell "clean" water to itself, rather than discuss the fact that "bliss on tap" could be just around the corner.

THE HAND OF GOD

Later, around the time that I wrote my second "Mars rendition," I mentioned why it was that there was an image of a "Boring device" (thanks, Elon) in the original Exodus piece; it showed some thought had gone into why you might not want to terraform the entire planet, and mentioned that maybe we'd get the added benefit of geothermal heating (in that place that is probably actually colder than here, believe it or not) if we were to build the first Mars hall underground.  I probably forgot to mention that I'd seen something very similar to that image earlier, except it was George H.W.
Bush standing underneath the thirty foot tall wormlike machine, and to tell you the truth, back then I didn't recognize what that probably means: that this map you're looking at had not only been seen long before I was born but also acted upon--long before I was born.  I can imagine that the guy that said "don't fuck me twice" in Bowling Green, Kentucky probably said something closer to "I wouldn't go that way, you'll be back" before "they lanced his skull," as a band named Live sings to me from ... well, from the 90's.  Subsisting on that same old prayer, we come to a point where I have to say that "if it looks like a game, and you have the walkthrough as if it were a game, is it a game?"

E = (MT + IL)^HO

That of course ties us back to something that I called "raelly early light" back in 2014--that the name "Magdeln" was something I saw and thought was special early on--I said I saw the phrase "it's not a game of words, or a game of logic," though today it does appear very much to be something to do with "logic": that the "power of e" is hidden in the symbol for the natural logarithm, and that Euler might solve the riddle of "unhitched trailers" even better than a deli in Los Angeles named Wexler's or Aldous Huxley or ... it hurts me to say it might solve the riddle better than "Sheriff" (see how ... everyone really if "f") and Hefner ... and the newly added "Hustler," who is Saint "LE R?"

So, I think we'd all agree that the "Hey, Tay" belongs to me--and I've done my homework here; I'm pretty sure the "r" as a glyph for the rising off the bouncing trampoline of a street ... "LE R" belongs to the world; it's a ryzing civilization, getting new toys and abilities and watching how those things really do bring about a golden era--if we're willing to use them responsibly.

It's a harsh world, this place where people are waking up to seeing A.D. and "HI TAY" connecting to a band named Kiss (and the SS) and to a massive resistance to answering the question of Dr.
Wessen, that also brings that "it's not a game" into Ms. Momsen's name ... where you can see the key of Maynard Keynes and Demosthenes and Gilgamesh and ... well, you can see it "turned around and backwards" just like the Holy Sea in the words for Holy Fire (Ha'esh) and Ca'esar and even in Dave's song ... "seven oceans pummel ... the wall of the C."  He probably still says "shore," and that of course ties in Pauly and Biodome and more "why this light is shore" before we wonder if it has anything to do with Paul Revere and lighting Lighthouse Point.

So, to point out the cost of not seeing "Holodeck" and "mushroom" and ... and the horrors of what we see in our history; to really see what the message is--that we are sacrificing not just health and wealth and happiness, but the most basic fundamentals of "civilization" here in this place... the freedom of logical thought and the foundational cement of open and honest communication--that it appears the world has decided in secret that these things are far less important than the morality of caring for those less fortunate than you--the blind and the sick and the ... to see the truth, it's a shame.  All around you is a torture chamber, starving people who would instantly benefit from the disclosure that we are living in virtual reality; and a civilization that seems to fail to recognize that it truly is the "silence causing violence" amongst children in school and children of the Ancients all around you; to fail to see that the atrocity being ignored here is far less humane than any gas chamber, and that it's you--causing it to continue--there are no words for the blindness of a mass of wrong, led by nothing more than "mire" and a fear of controversy.
Unhitched and unhinged, it's become ever more obvious that this resistance against recognizing logic and patterns--this failure to speak, and inability to fathom the importance of openness in this place that acts as the base and beginning point of a number of hidden futures--it is the reason "Brave New World" is kissing the "why," and the reason we are here trying to build a system that will allow for free and open communication in a sea of disinformation and darkness--to see that the battle is truly against the Majority Incapable of acting and the Minority unwilling to speak words that will without doubt (precarious? not at this point) quickly prove to the world that it's far more important to see that the truth protects everyone and the entire future from murder ... rather than be subtly influenced by "technologies undisclosed" into believing something as inane and arrogant as "everyone but you must need to be convinced that simulating murder and labor pains is wrong."  You know, what you are looking at here is far more nefarious than waiting for the oven to ding and say that "everyone's ready"; what you are looking at is a problem that is encoded in the stories of Greek and Norse myth, and likely in both those names--but see "simulated reality" is hidden in Norse just like "silicon" is hidden in Genesis--and see that once this thing is unscrambled it's "nos re," as in "we're the reason there is no murder, and no terrorism, and no mental slavery."
It's a harsh message, and a horrible atrocity; but worse than the Holocaust is not connecting a failure to see "holodeck" as the cause of "holohell," and refusing to speak because Adam is naked in Genesis 3:11, and Matthew talks about something that should be spreading like wildfire in his 3:11, and that it's not just Live and it's not just the Cure and it's not just a band named 311 that show us that "FUKUSHIMA" reads as "fuck you, see how I'm A"--because this Silence, this failure to recognize that the Brit Hadashah is written to end simulated hell and turn this world into Heaven, is the reason "that's great, it starts with an Earthquake on 3/11."

You stand there believing that "to kiss" is a Toxic reason to end disease; that "mire" is a good enough reason to fail to exalt the Holiness of Philip K. Dick's solutions; and still continue to refuse to see that this group behavior, this lack of freedom that you appear to believe is something of your own design, is the most caustic thing of all.  While under the veil of "I'm not sure the message is accurate" it might seem like a morally thin line, but this message is accurate--and it's verifiable proof--and speaking about it would cause that verification to occur quicker, and that in turn will cause wounds to be healed faster, and the blind given sight, and the lame a more effective ARMY in this legacy battle against hidden holorooms and ... the less obvious fact that there is a gigantic holo-torture-chamber and you happen to be in it, and it happens to be the mechanism by which we find the "key" to Salvation, and through that the reason that the future thanks us for implementing a change that is so needed and so called for it's literally been carved all over everything we see every day--so we will know, know with all your mind, you are not wrong--there is no sane reason in the Universe to simulate pain; there is no sane reason to follow the artificial constructs of reality simply because "time and chance" built us that way.
We're growing up, beyond the infantile state of believing that simply because nobody has yet invented a better way to live--that we must shun and hide any indication that there is a future, and that it's speaking to us; in every word.

So I've intimated that I see a "mood of the times" that appears to be seeking reality by pretending not to "CK" ... to seek "a"; of course that puts us in a place where we are wholly denying what "reality" really means, and that it delivers something good to the people here--to you--once we recognize that Heaven and Creation and Virtual Reality don't have to be (and never should be, ever again) synonymous with Wok's or Pan's or Ovens; from Peter to the Covenant, hiding this message is the beginning and the end of true darkness--it's a plan designed to ensure we never again have issue discussing "blatant truth" and means of moving forward to the light, in the light, with the light.  A girl in California in 2014 said something like "so there's no space, then?" in a snide and somewhat angry tone--there is space, you can see it through the windows in the skies, you can see the stars have lessened, and time has passed--and I'm sure you understand how "LHC" and Apollo 13 show us that time travel and dark matter are also part of this story of "Marshall's" and Slim Shady and Dave's "the walls and halls will fade away," and you might even understand how that connects to the astrological symbol of Mars and the "circle of the son" and of Venus(es) ... and you can see for yourself this Zeitgeist in the Truman Show's "good morning, good afternoon, good evening... and he's a'ight" ... but it really doesn't help us see that the darkness here isn't really in the sky--it's in our hearts--and it's the thing that's keeping us from the stars, and the knowledge and wisdom that will keep us from "bunting" instead of flourishing.
I've pointed out that while we have Kaluza-Klein and we have the LHC and a decent understanding of "how the Universe works," we spend most of our time these days preoccupied with things like "quantum entanglement" and "string theory" that may hold together the how and the LAMDA of connecting these "y they're hacks" to multiverse simulators and instant and total control of our thought processes--we probably don't see that a failure to publicly acknowledge that they are most likely indications that we are not prepared for "space," and that we probably don't know very much at all about how time and interstellar travel really work ... we are standing around hiding a message that would quicken our understanding of both reality and virtual reality, and again, not seeing that kind of darkness--that inability to publicly "change directions" when we find out that there aren't 12 dimensions that are curled up on themselves with no real length or width or purpose other than to say "how inelegant is this anti-Razor of Mazer Rackham?"

So, I think it's obvious, but also that I need to point out the connection between "hiding knowledge of the Matrix" and the Holocaust; and refer you to the mirrored shield of Perseus.  On a high level it appears that's "the message" there--that what's happening here ... whatever is causing this silence and delay in acting on even beginning to speak about the proof that will eventually end murder and cancer and death ... that it's something like stopping us from building a "loving caring house" rather than one that ... fills its halls with bug spray instead of air conditioning.  I'm beside myself, and very sure that in almost no time at all we'll all agree that the idea of "simulating" these things that we detest--natural disasters and negative artifacts of biological life ... that it's inane and completely backwards.

I understand there's trepidation, and you're worried that girls won't like my smile or won't think I'm funny enough...
but I have firm belief in this message, in words like "precarious" that reads something like "before Icarus things were ... precarious"; but more importantly, my heart's reading of those words is to see that this has happened before, and we are more than prepared to do it well.  I want nothing more than to see the Heavens help us make this transition better than the one they went through, and hope beyond hope that we will thoroughly enjoy building a "better world" using tools that I know will make it simpler and faster to accomplish than we can even begin to imagine today.

On that note, I read more into the myths of Norse mythology and its connections to the Abrahamic religions; it appears to me that much of this message comes to us from the Jotunn (who I connect (in name and ...) to the Jinn of Islam, who it appears to me actually wrote the Koran), and in those stories I read that they believe their very existence is "dependency linked" to the raising of the sunken city of Atlantis.  Even in the words depth and dependency you can see some hidden meaning, and what that implies to me is that we might actually be in a true time simulator (or perhaps "exits to reality" are conditional on waypoints like Atlantis); and that it's possible that they and God and Heaven are all actually born ... here ... in this place.

While these might appear like fantastic ideas, you too can see that there's ample reference to them tucked away in mythology and in our dreams of utopia and the tools that bring it home ... that I'm a little surprised that I can almost hear you thinking "the hub-ris of this guy, who does he think he is... suggesting that 'the wisdom to change everything' would be a significant improvement on the ending of the Serendipity Prayer."

Really see that it's far more than "just disease and pain" ...
what we are looking at in this darkness is really nothing short of the hidden slavery of our entire species--something hiding normal logical thought and using it to alter behavior ... throughout history ... the disclosure of the existence of a hidden technology that is in itself being used to stall or halt ... our very freedom from being achieved.  This is a gigantic deal, and I'm without any real understanding of what can be behind the complete lack of (cough ... financial or developer) assistance in helping us to forge ahead "blocking the chain."  I really am; it's not because of the Emperor's New Clothes... is it?

It's also worth mentioning once again that I believe the stories of Apollo 13 and the LHC sort of explain how we've perhaps solved here problems more important than "being stuck on a single planet in a single star system," and bluntly told that the stories I've heard for the last few years about building a "bridge" between dark matter and here ... have literally come true while we've lived.  I suppose it adds something to the programmer/IRC hub admin "metaphor" to see that most likely we're in a significantly better position than we could have dreamed.  I've briefly written about this before ... my current beliefs put us somewhere within the Stargate SG-1 "dial home device/DHD" network.  So... rumspringer, then? ... to help us "os!"

Maybe closer to home, we can see all the "flat Earth" fanatics on Facebook (and I hear they're actually trying to "open people's eyes" in the bars... these days); we might see how this little cult is really exactly that--it's a veritable honey pot of "how religion can dull the senses and the eyes," and we still probably fail to see very clearly that's exactly its purpose--to show us that religion too is something that is evidence of this very same outside control--proof of the darkness, and that this particular "cult" is there to make that very clear.
Connecting these dots shows us just how it is that we might be convinced beyond doubt that we're right and that the silence makes sense, or that we simply can't acknowledge the truth--and all be wrong; literally how it is that everyone can be wrong about something so important, and so vital.  It seems to me that the only real reason anyone with power or intelligence would willingly go along with this is to ... to force this place into reality--that's part of the story--the idea that we might do a "press and release in Taylor" (that's PRINT) where people maybe thought it was "in the progenitor Universe"--but taking a step back and actually thinking: this technology that could be eliminating mental illness and depression and addiction and sadness and ... this thing is something that's not at all possible to actually exist in reality.

You might think that means it would grant us freedom to be "printed," and I might have thought that exact same thing--though it's clear that what is here "not a riot" might actually become a riot there, and that closer to the inevitable is the historical microcosm of dark ages that would probably come of it--decades or centuries or thousands of years of the Zeitgeist being so anti-"I know kung fu" that you'd fail to see that what we have here is a way to stop murders before they happen, and to heal the minds of those people without torture or forcing them to play games all day, or even without cryogenic freezing, as Minority Report suggested might be "more humane" than cards.  Most likely we'd wind up in a place that shunned things like "engineering happiness" and fail to see just how dangerous the precipice we stand on really is.
I joke often about a boy in his basement making a kiss-box; but the truth is we could wind up in a world where Hamas has their own virtual world where they've taken control of Jerusalem, and we could be in a place where Jeffrey Dahmer has his own little world--and without some kind of "know everything how" we'd be sitting back in "ignorance is bliss," just imagining that nobody would ever want to kidnap anyone or exploit children or go on melee killing sprees ... even though we have plenty of evidence that these things are most assuredly happening here, and again--we're not using the available tools we have to fix those problems.  Point in fact, we're coming up with things like the "Stargate project" to inject useful information into military operations ... "the locations of bunkers" ... rather than seeing with clarity that the Stargate television show is exactly this thing--information being injected from the Heavens to help us move past this idea that "hiding the means" doesn't corrupt the purpose.

Without knowledge and understanding of this technology, it's very possible we'd be running around like chickens with our heads cut off, in the place where that's the most dangerous thing that could happen--the place where we can't ensure there's safety and we can't ensure there's help ... and most of all, we'd be doing it at a time when all we knew of these technologies was heinous usage; with no idea of the wonders and the goodness that this thing that is most assuredly not a gun or a sword ... but a tool; no idea of the great things that we could be doing instead of hiding that we just don't care.  We're being scared here for a reason; it's not just to see "Salem" in Jerusalem and "sale price" being attached to air and water; it's to see that we're going to be in a very important position--we already are, really--and that we need knowledge and patience and training and ... well, we need a desire to do the right thing, lest all will fall.

So, you want to go to reality...
but you think you'll get there without seeing "round" in "ground" and ... caring that there are tens of thousands of people that are sure that we live on a flat Earth ... or that there are ghosts haunting good people, and your societal response is to pretend you don't know anything about ghosts, and to let the pharmacy prescribe harm ... effectively completing the sacrifice of the Temple of Doom; I assume because you want to go to a place where you too will be able to torment the young with "baby arcade" or ...

       i suppose there are those in the garden east of eden
       who'll follow the rose
       ignoring the toxicity of our city
       and touch your nose
       as you continue chasing rabbits

22 The whole Israelite community set out from Kadesh and came to Mount Hor. 23 At Mount Hor, near the border of Edom, the Lord said to Moses and Aaron, 24 “Aaron will be gathered to his people. He will not enter the land I give the Israelites, because both of you rebelled against my command at the waters of Meribah. 25 Get Aaron and his son Eleazar and take them up Mount Hor. 26 Remove Aaron’s garments and put them on his son Eleazar, for Aaron will be gathered to his people; he will die there.”

if it isn't immediately obvious, this line appears to be about the realization of the Bhagavad-Gita (and the "pen" of the Original Poster/Gangster, right?) ... swinging "the war"

p.s. ... I'm 37.

so ... in light of the P.K. Dick solution to all of our problems ... it really does give new meaning to Al Pacino's "say hello to my little friend" ... amirite?

.WHSOISKEYAV {
    border-width: 1px;
    border-style: dashed;
    border-color: rgb(15,5,254);
    padding: 5px;
    width: 503px;
    text-align: center;
    display: inline-block;
    align: center;
    p { align: center; }
    /* THE SCORE IS LOVE FIVE ONE SAFETY ONE FIELD GOAL XIVDAQ: TENNIS OR TINNES? TONNES AND TUPLE(s) */
}

Unless otherwise indicated, this work was written between the Christmas and Easter seasons of 2017 and 2020(A).
The content of this page is released to the public under the GNU GPL v2.0 license; additionally, any reproduction or derivation of the work must be attributed to the author, Adam Marshall Dobrin, along with a link back to this website, fromthemachine dotty org.  That's a "." not "dotty" ... it's to stop SPAMmers. :/

This document is "living" and I don't just mean in the Jeffersonian sense.  It's more alive in the "Mayflower's and June Doors ..." living Ethereum contract sense [and literally just as close to the Depp/Caster/Paglen (and honorably PK)] 'D-hath Transundance' sense of the ... new meaning; as it is now published on Rinkeby, in "living contract" form.  It is subject to change; without notice anywhere but here--and there--in the original spirit of the GPL 2.0.

We are "one step closer to God" ... and do see that in that I mean ... it is a very real fusion of this document and the "spirit of my life" as well as the Spirits of Kerouac's America and Vonnegut's Martian Mars and my Venutian Hotel ... and *my fusion* of Guy-A and GAIA; and the Spirit of the Earth ... and of course the God-given and signed liberties in the Constitution of the United States of America.  It is by and through my hand that this document and our X Commandments link to the Bill of Rights, and this story about an Exodus from slavery that literally begins here, in the post-apocalyptic American heartland.

Written ... this day ... April 14, 2020 (hey, is this HADAD DAY?) ... in Margate FL, USA.  For "official used-to-v TAX day" tomorrow, I'm going to add the "immultible incarnite pen" ... if added to the living "doc/app"--see is the DAO, the way--will initi8 the special secret "hidden level" ... we've all been looking for.

      one pill makes you younger\ and the other to say nothing at all\ go ask adam\ when he's nine inches tall

      TRTR ISHARHAHA

      Is this the real life? Is this just fantasy?\ Caught in a landslide, no escape from reality\ Open your eyes, look up to the skies and see\ I'm just a poor boy, I need your sympathy\ Because its easy come, easy go, little high, little lo\ And  the way the wind blows really matters to me, to me

      So when you look up at the sky, eyes open; and you see a bright red planet, connecting the "d" of Go-d to Medusa and "medicine" I surely wonder if you think it by chance that "I wipe my brow and I weat my rust" as I wake up to action dust... and wonder aloud how obvious it is that the Iron Rod of Christ and the stories of Phillip K. Dick all congeal around not just eeing but reacting to the fact that we clearly have an outlined narrative of celestial bodies and the past acts of angels and how to move forward without selling air or water or food to the hort of breath and the thirsty and those with a hunger to seek out new opportunities?  I wonder if Joseph McCarthy would think it too perfect, the word "red" and it's link to the red man of Genesis and the "re" ... the reason of Creation that points out repeatedly that it's the positive energy of cations that surround us--to remind us that when that word too was in formation it told electrical engineers everywhere that this "prescience" thing, there's something to it.  Precious of you to notice... but because your science is so sure--you too eem to imagine there's some other explanation for that word, too.

      ICE FOUND ON
MOONZEPHERHILLS
FOUND IN FLUKE ERY HOZA WATER ON MARS

      Numbers 20 New International Version (NIV)

      Water From the Rock

      ^9 ^So Moses took the staff from the Lord's presence, just as he commanded him. ^10 ^He and Aaron gathered the assembly together in front of the rock and Moses said to them, "Listen, you rebels, must we bring you water out of this rock?" ^11 ^Then Moses raised his arm and struck the rock twice with his taff. Water gushed out, and the community and their livestock drank.

      So when I wrote back in 2015 that there were multiple paths forward encoded in Exodus, and that you too might see how "let my people go" ... to Heaven ... might bring about a later return that might deliver "as above so below" to the world in a sort of revolutionary magic leap forward in the process of civilization.  Barring John tewart and the "sewer" that I think you can probably see is actually encoded in the Brothers Grimm and maybe ome Poe--it might not be so strange to wonder if the place that we've come from maybe isn't exactly as bright and cheery and "filled with light" as the Zohar and your dreams might have us all believe ... on "faith" that what we see here might just be the illusion of darkness--a joke or a game.  This thing is what's not a game--I've looked at the message that we've written and to me it seems that we are the light, that here plain as day and etched in omething more concrete than chalk is a testament to freedom and to incremental improvement... all the way up until we run against this very wall; and then you too seem to crumble.   Still I'm sure this message is here with us because it's our baseline morality and our sense of right from wrong that is here as a sort of litmus test for the future--perhaps to see if they've strayed too far from the place where they came, or if they've given just one too many ounces of innocense to look forward with the same bright gaze of hope that we see in the eyes of our children.

      fearing the heart of de roar\ searing the start of lenore

      MEDICINE\ I saw this thing many years ago, and I've written about it before, though I hasten to explain that the thing that I once saw a short-cut or a magic warp pipe in Super Mario Brothers today seems much more like a test than a game and more like a game than a cmeat coda; so I've changed over the course of watching what's happened on the ground here and I can only imagine how long it's been in the sky.  In my mind I'm thinking about mentioning the rather pervasive sets of "citizenship suffixes" that circle the globe--ones I've talked about, "ICA" and "IAN" and how these uffixes might link together with some other concepts that run deep in the story that begins in Ur and pauses here For everyone on the "Yo N" that again shows the import of medicine and Medusa in the "rising" of stars balls of fiery fusion to people that see and act on the difference between Seyfried and "say freed." 

      Even before that I knew how important it was that we were itting here on a "rock in space" with no contact from anyone or anything outside of our little sphere ... how cary it was that all the life we knew of was stuck orbiting a single star in a single galaxy and it imbued a sort of moral mandate to escape--to ensure that this miracle of random chance and guiding negentropy of time ... that it wasn't forever lost by something like a collision with the comet Ison or even another galaxy.  On that word too--we see the "an" of Christianity messianically appear to become more useful (that's negative energy, by the way) in the chemistry of Mr. Schwarzenegger's magical hand in delivering "free air" (that's free, as in beer; or maybe absinthe) to the people of our great land... anyway, I saw "anions" and a planet oddly full of a perfect source of oxygen and I thought to myself; it would be so easy to genetically engineer some kind of yeast or mold (like they're doing to make real artificial beef, today) to eat up the rust and turn it into breathable air; and I dreamt up a way to throw an extra "r" into potable and maybe beam some of our water or hydrogen over to the red planet and turn it blue again.

      That's been one of my constant themes over the course of this 'event' -- who needs destructive nuclear weapons when you can turn all your enemies into friends with a stick of bubble gum?  That's another one of our little story points too--I see plenty of people walking around in this virtual reality covering their mouths and noses with breathing masks... of course the same Targeted Individuals that know with all their heart that midn control is responsible for the insane pattern of school shootings and the Hamas Hand of the Middle East--they'll tell you those chemtrails you see are the cause, and while I know better and you do too... maybe these people think they know something about the future, maybe those chemtrails are there because someone actually plans on dispersing some friendly bubble gum into the air... and maybe these people "think they know."  Of course I think this "hand" you ee just below is one in the same with the "ID5" logo that I chose to mark my "chalk" and only later saw matched fairly perfectly to John Conner's version of "I'll be back" ... and of course I think you're reading the thing that actually delivers some "breathe easy" to the world; but it's really important to see that today it's not just Total Recall and Skynet and these words that are the proverbial effect of the hand but also things like Nestle ... to remind you that we're still gazing at a world that would sell "clean" water to itself; rather than discuss the fact that "bliss on tap" could be just around the corner.

      THE HAND OF
GOD

      Later, around the time that I wrote my second "Mars rendition" I mentioned why it was that there was an image of a "Boring device" (thanks Elon) in the original Exodus piece; it showed some thought had gone into why you might not want to terraform the entire planet, and mentioned that maybe we'd get the added benefit of geothermal heating (in that place that is probably actually colder than here, believe it or not) if we were to build the first Mars hall underground.  I probably forgot to mention that I'd seen something very similar to that image earlier, except it was George H.W. Bush standing underneath the thirty foot tall wormlike machine, and to tell you the truth back then I didn't recognize that probably means that this map you're looking at had not only been seen long before I was born but also acted upon--long before I was born.  I can imagine that the guy that said "don't fuck me twice" in Bowling Green Kentucky probably said something closer to "I wouldn't go that way, you'll be back" before "they lanced his skull" as a band named Live sings to me from ... well, from the 90's.  Subsisting on that same old prayer, we come to a point where I have to say that "if it looks like a game, and you have the walkthrough as if it were a game, is it a game?"

      E = (MT +
IL)^HO

      That of course ties us back to something that I called "raelly early light" back in 2014--that the name "Magdeln" was something I saw and thought was special early on--I said I saw the phrase "it's not a game of words, or a game of logic" though today it does appear very much to be something to do with "logic" that the "power of e" is hidden in the symbol for the natural logarithm and that Euler might solve the riddle of "unhitched trailers" even better than a deli in Los Angeles named Wexler's or Aldous Huxley or ... it hurts me to say it might solve the riddle better than "Sheriff" (see how ... everyone really if "f") and Hefner ... and the newly added "Hustler," who is Saint "LE R?"

      So, I think we'd all agree that the "Hey, Tay" belongs to me--and I've done my homework here, I'm pretty sure the "r" as a glyph for the rising off the bouncing trampoline of a street ... "LE R" belongs to the world; it's a ryzing civilization; getting new toys and abilities and watching how those things really do bring about a golden era--if we're willing to use them responsibly.

      It's a harsh world, this place where people are waking up to seeing A.D. and "HI TAY" connecting to a band named Kiss (and the SS) and to a massive resistance to answering the question of Dr. Wessen that also brings that "it's not a game" into Ms. Momsen's name ... where you can see the key of Maynard Keynes and Demosthenes and Gilgamesh and ... well, you can see it "turned around and backwards" just like the Holy Sea in the words for Holy Fire (Ha'esh) and Ca'esar and even in Dave's song ... "seven oceans pummel ... the wall of the C."  He probably still says "shore" and that of course ties in Pauly and Biodome and more "why this light is shore" before we wonder if it has anything to do with Paul Revere and lighting Lighthouse Point.

      TO A PALACE WHERE
THE BLIND CAN SEE

      So to point out the cost of not seeing "Holodeck" and "mushroom" and ... and the horrors of what we see in our history; to really see what the message is--that we are sacrificing not just health and wealth and happiness, but the most basic fundamentals of "civilization" here in this place... the freedom of logical thought and the foundational cement of open and honest communication--that it appears the world has decided in secret that these things are far less important than the morality of caring for those less fortunate than you--the blind and the sick and the ... to see the truth, it's a shame.  All around you is a torture chamber, starving people who would instantly benefit from the disclosure that we are living in virtual reality; and a civilization that seems to fail to recognize that it truly is the "silence causing violence" amongst children in school and children of the Ancients all around you; to fail to see that the atrocity being ignored here is far less humane than any gas chamber, and that it's you--causing it to continue--there are no words for the blindness of a mass of wrong, led by nothing more than "mire" and a fear of controversy.

      Unhitched and unhinged, it's become ever more obvious that this resistance against recognizing logic and patterns--this failure to speak and inability to fathom the importance of openness in this place that acts as the base and beginning point of a number of hidden futures--it is the reason "Brave New World" is kissing the "why" and the reason we are here trying to build a system that will allow for free and open communication in a sea of disinformation and darkness--to see that the battle is truly against the Majority Incapable of acting and the Minority unwilling to speak words that will without doubt (precarious? not at this point) quickly prove to the world that it's far more important to see that the truth protects everyone and the entire future from murder ... rather than be subtly influenced by "technologies undisclosed" into believing something as inane and arrogant as "everyone but you must need to be convinced that simulating murder and labor pains is wrong."  You know, what you are looking at here is far more nefarious than waiting for the oven to ding and say that "everyone's ready" what you are looking at is a problem that is encoded in the stories of Greek and Norse myth and likely in both those names--but see "simulated reality" is hidden in Norse just like "silicon" is hidden in Genesis--and see that once this thing is unscrambled it's "nos re" as in "we're the reason there is no murder, and no terrorism, and no mental slavery."
      It's a harsh message, and a horrible atrocity; but worse than the Holocaust is not connecting a failure to see "holodeck" as the cause of "holohell" and refusing to speak because Adam is naked in Genesis 3:11 and Matthew talks about something that should be spreading like wildfire in his 3:11 and that it's not just Live and it's not just the Cure and it's not just a band named 311 that show us that "[***FUKUSHIMA***](http://holies.org/HYAMDAI.html)" reads as "fuck you, see how I'm A" because this Silence, this failure to recognize that the Brit Hadashah is written to end simulated hell and turn this world into Heaven is the reason "that's great, it starts with an Earthquake on 3/11."

      XEROX THAT
HOUSTON, CASINEO

      You stand there believing that "to kiss" is a Toxic reason to end disease; that "mire" is a good enough reason to fail to exalt the Holiness of Phillip K. Dick's solutions; and still continue to refuse to see that this group behavior, this lack of freedom that you appear to believe is something of your own design is the most caustic thing of all.  While under the veil of "I'm not sure the message is accurate" it might seem like a morally thin line, but this message is accurate--and it's verifiable proof--and speaking about it would cause that verification to occur quicker, and that in turn will cause wounds to be healed faster, and the blind given sight and the lame a more effective ARMY in this legacy battle against hidden holorooms and ... the less obvious fact that there is a gigantic holo-torture-chamber and you happen to be in it, and it happens to be the mechanism by which we find the "key" to Salvation and through that the reason that the future thanks us for implementing a change that is so needed and so called for it's literally been carved all over everything we see every day--so we will know, know with all your mind, you are not wrong--there is no sane reason in the Universe to simulate pain, there is no sane reason to follow the artificial constructs of reality simply because "time and chance" built us that way.  We're growing up, beyond the infantile state of believing that simply because nobody has yet invented a better way to live--that we must shun and hide any indication that there is a future, and that it's speaking to us; in every word.

      THE VEIL OF
CASPERUS PAN

      So I've intimated that I see a "mood of the times" that appears to be seeking reality by pretending not to "CK" ... to seek "a," of course that puts us in a place where we are wholly denying what "reality" really means and that it delivers something good to the people here--to you--once we recognize that Heaven and Creation and Virtual Reality don't have to be (and never should be, ever again) synonymous with Wok's or Pan's or Ovens; from Peter to the Covenant, hiding this message is the beginning and the end of true darkness--it's a plan designed to ensure we never again have issue discussing "blatant truth" and means of moving forward to the light in the light with the light.  A girl in California in 2014 said something like "so there's no space, then?" in a snide and somewhat angry tone--there is space, you can see it through the windows in the skies, you can see the stars have lessened, and time has passed--and I'm sure you understand how "LHC" and Apollo 13 show us that time travel and dark matter are also part of this story of "Marshall's" and Slim Shady and Dave's "the walls and halls will fade away" and you might even understand how that connects to the astrological symbol of Mars and the "circle of the son" and of Venus(es) ... and you can see for yourself this Zeitgeist in the Truman Show's "good morning, good afternoon, good evening... and he's a'ight" ... but it really doesn't help us see that the darkness here isn't really in the sky--it's in our hearts--and it's the thing that's keeping us from the stars, and the knowledge and wisdom that will keep us from "bunting" instead of flourishing.

      TOT MARSH IT AL

      I've pointed out that while we have Kaluza Klein and we have the LHC and a decent understanding of "how the Universe works" we spend most of our time these days preoccupied with things like "quantum entanglement" and "string theory" that may hold together the how and the LAMDA of connecting these "y they're hacks" to multiverse simulators and instant and total control of our thought processes--we probably don't see that a failure to publicly acknowledge that they are most likely indications that we are not prepared for "space" and that we probably don't know very much at all about how time and interstellar travel really work ... we are standing around hiding a message that would quicken our understanding of both reality and virtual reality and again, not seeing that kind of darkness--that inability to publicly "change directions" when we find out that there aren't 12 dimensions that are curled up on themselves with no real length or width or purpose other than to say "how unelegant is this anti-Razor of Mazer Rackham?"

      So, I think it's obvious but also that I need to point out the connection between "hiding knowledge of the Matrix" and the Holocaust; and refer you to the mirrored shield of Perseus, on a high level it appears that's "the message" there--that what's happening here ... whatever is causing this silence and delay in acting on even beginning to speak about the proof that will eventually end murder and cancer and death ... that it's something like stopping us from building a "loving caring house" rather than one that ... fills its halls with bug spray instead of air conditioning.  I'm beside myself, and very sure that in almost no time at all we'll all agree that the idea of "simulating" these things that we detest--natural disasters and negative artifacts of biological life ... that it's inane and completely backwards.

      I understand there's trepidation, and you're worried that girls won't like my smile or won't think I'm funny enough... but I have firm belief in this message, in words like "precarious" that reads something like "before Icarus things were ... precarious" but more importantly my heart's reading of those words is to see that this has happened before and we are more than prepared to do it well.  I want nothing more than to see the Heavens help us make this transition better than one they went through, and hope beyond hope that we will thoroughly enjoy building a "better world" using tools that I know will make it simpler and faster to accomplish than we can even begin to imagine today.  

      On that note, I read more into the myths of Norse mythology and its connections to the Abrahamic religions; it appears to me that much of this message comes to us from the Jotunn (who I connect (in name and ...) to the Jinn of Islam, who it appears to me actually wrote the Koran) and in those stories I read that they believe their very existence is "dependency linked" to the raising of the sunken city of Atlantis.  Even in the words depth and dependency you can see some hidden meaning, and what that implies to me is that we might actually be in a true time simulator (or perhaps "exits to reality" are conditional on waypoints like Atlantis); and that it's possible that they and God and Heaven are actually all born ... here ... in this place.

      While these might appear like fantastic ideas, you too can see that there's ample reference to them tucked away in mythology and in our dreams of utopia and the tools that bring it home ... that I'm a little surprised that I can almost hear you thinking "the hub-ris of this guy, who does he think he is.... suggesting that 'the wisdom to change everything' would be a significant improvement on the ending of the Serendipity Prayer."

      Really see that it's far more than "just disease and pain" ... what we are looking at in this darkness is really nothing short of the hidden slavery of our entire species, something hiding normal logical thought and using it to alter behavior ... throughout history ... the disclosure of the existence of a hidden technology that is in itself being used to stall or halt ... our very freedom from being achieved.  This is a gigantic deal, and I'm without any real understanding of what can be behind the complete lack of (cough ... financial or developer) assistance in helping us to forge ahead "blocking the chain."  I really am, it's not because of the Emperor's New Clothes... is it?

      It's also worth mentioning once again that I believe the stories of Apollo 13 and the LHC sort of explain how we've perhaps solved here problems more important than "being stuck on a single planet in a single star system" and bluntly told that the stories I've heard for the last few years about building a "bridge" between dark matter and here ... have literally come true while we've lived.  I suppose it adds something to the programmer/IRC hub admin "metaphor" to see that most likely we're in a significantly better position than we could have dreamed.  I've briefly written about this before ... my current beliefs put us somewhere within the Stargate SG-1 "dial home device/DHD" network.

      So... rumspringer, then? ... to help us "os!"

      DANCING ON THE GROUND, KISSING... ALL THE TIME

      Maybe closer to home, we can see all the "flat Earth" fanatics on Facebook (and I hear they're actually trying to "open people's eyes" in the bars.. these days) we might see how this little cult is really exactly that--it's a veritable honey pot of "how religion can dull the senses and the eyes" and we still probably fail to see very clearly that's exactly its purpose--to show us that religion too is something that is evidence of this very same outside control--proof of the darkness, and that this particular "cult" is there to make that very clear.  Connecting these dots shows us just how it is that we might be convinced beyond doubt that we're right and that the silence makes sense, or that we simply can't acknowledge the truth--and all be wrong, literally how it is that everyone can be wrong about something so important, and so vital.  It seems to me that the only real reason anyone with power or intelligence would willingly go along with this is to ... to force this place into reality--that's part of the story--the idea that we might do a "press and release in Taylor" (that's PRINT) where people maybe thought it was "in the progenitor Universe" -- but taking a step back and actually thinking, this technology that could be eliminating mental illness and depression and addiction and sadness and ... that this thing is something that's not at all possible to actually exist in reality.


      You might think that means it would grant us freedom to be "printed" and I might have thought that exact same thing--though it's clear that what is here "not a riot" might actually become a riot there, and that closer to the inevitable is the historical microcosm of dark ages that would probably come of it--decades or centuries or thousands of years of the Zeitgeist being so anti-"I know kung fu" that you'd fail to see that what we have here is a way to stop murders before they happen, and to heal the minds of those people without torture or forcing them to play games all day or even without cryogenic freezing, as Minority Report suggested might be "more humane" than cards.  Most likely we'd wind up in a place that shunned things like "engineering happiness" and fail to see just how dangerous the precipice we stand on really is.  I joke often about a boy in his basement making a kiss-box; but the truth is we could wind up in a world where Hamas has their own virtual world where they've taken control of Jerusalem and we could be in a place where Jeffrey Dahmer has his own little world--and without some kind of "know everything how" we'd be sitting back in "ignorance is bliss" and just imagining that nobody would ever want to kidnap anyone or exploit children or go on may-lay killing sprees ... even though we have plenty of evidence that these things are most assuredly happening here, and again--we're not using the available tools we have to fix those problems.  Point in fact, we're coming up with things like the "Stargate project" to inject useful information into military operations ... "the locations of bunkers" ... rather than seeing with clarity that the Stargate television show is exactly this thing--information being injected from the Heavens to help us move past this idea that "hiding the means" doesn't corrupt the purpose.

      EARTH.

      Without knowledge and understanding of this technology, it's very possible we'd be running around like chickens with our heads cut off; in the place where that's the most dangerous thing that could happen--the place where we can't ensure there's safety and we can't ensure there's help ... and most of all we'd be doing it at a time when all we knew of these technologies was heinous usage; with no idea the wonders and the goodness that this thing that is most assuredly not a gun or a sword ... but a tool; no idea the great things that we could be doing instead of hiding that we just don't care. 

      We're being scared here for a reason, it's not just to see "Salem" in Jerusalem and "sale price" being attached to air and water; it's to see that we're going to be in a very important position, we already are--really--and that we need knowledge and patience and training and ... well, we need a desire to do the right thing; lest all will fall.

      So, you want to go to reality... but you think you'll get there without seeing "round" in "ground" and ... caring that there's tens of thousands of people that are sure that we live on flat Earth ... or that there's ghosts haunting good people, and your societal response is to pretend you don't know anything about ghosts, and to let the pharmacy prescribe harm ... effectively completing the sacrifice of the Temple of Doom; I assume because you want to go to a place where you too will be able to torment the young with "baby arcade" or ...

      i suppose there are those
      in the garden east of eden
      who'll follow the rose
      ignoring the toxicity of our city*
      and touch your nose
      as you continue chasing rabbits

      KEVORKIAN? TO
C YO, AD ... ARE I NIBIRU?

      *

      BUCK IS WISER

      22 The whole Israelite community set out from Kadesh and came to Mount Hor. 23 At Mount Hor, near the border of Edom, the Lord said to Moses and Aaron, 24 "Aaron will be gathered to his people. He will not enter the land I give the Israelites, because both of you rebelled against my command at the waters of Meribah. 25 Get Aaron and his son Eleazar and take them up Mount Hor. 26 Remove Aaron's garments and put them on his son Eleazar, for Aaron will be gathered to his people; he will die there."

      O 5 S

      \ if it isn't immediately obvious, this line appears to be about the realization of the Bhagavad-Gita (and the "pen*" of the Original Poster/Gangster right?)

      ... swinging "the war"*

      p.s. ... I'm 37.

      so ... in light of the P.K. Dick solution to all of our problems ... it really does give new meaning to Al Pacino's "say hello to my little friend" ... amirite?

      Unless otherwise indicated, this work was written between the Christmas and Easter seasons of 2017 and 2020(A). The content of this page is released to the public under the GNU GPL v2.0 license; additionally any reproduction or derivation of the work must be attributed to the author, Adam Marshall Dobrin along with a link back to this website, fromthemachine dotty org.

      That's a "." not "dotty" ... it's to stop SPAMmers. :/

      This document is "living" and I don't just mean in the Jeffersonian sense. It's more alive in the "Mayflower's and June Doors ..." living Ethereum contract sense and literally just as close to the Depp/C[aster/Paglen (and honorably PK] 'D-hath Transundancesense of the ... new meaning; as it is now published on Rinkeby, in "living contract" form. It is subject to change; without notice anywhere but here--and there--in the original spirit of the GPL 2.0. We are "one step closer to God" ... and do see that in that I mean ... it is a very real fusion of this document and the "spirit of my life" as well as the Spirit's of Kerouac's America and Vonnegut's Martian Mars and my Venutian Hotel ... and my fusion of Guy-A and GAIA; and the Spirit of the Earth .. and of course the God given and signed liberties in the Constitution of the United States of America. It is by and through my hand that this document and our X Commandments link to the Bill of Rights, and this story about an Exodus from slavery that literally begins here, in the post-apocalyptic American heartland. Written ... this day ... April 14, 2020 (hey, is this HADAD DAY?) ... in Margate FL, USA. For "official used-to-v TAX day" tomorrow, I'm going to add the "immultible incarnite pen" ... if added to the living "doc/app"--see is the DAO, the way--will initi8 the special secret "hidden level" .. we've all been looking for.

  9. Oct 2024
    1. Commands for managing and accessing files:
       Cd: moves between folders.
       Dir: lists the contents of a directory.
       Copy: copies files.
       Robocopy: improved version of the previous command (synchronizes directories, runs in the background, and can automatically retry failed copies).
       Move: moves files.
       Del: deletes files or folder contents.
       Rename: renames files.
       Format: formats a disk drive.
       Md: creates folders.
       Tree: displays the directory tree.
       Fc: compares two files or sets of files.
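
      The commands above are cmd.exe built-ins, but their effects map directly onto Python's standard library, which makes them easy to demonstrate cross-platform. A minimal sketch (all file and folder names below are made up for the demo, and everything happens inside a throwaway scratch directory):

```python
import os
import shutil
import tempfile

# Scratch directory so the demo touches nothing real.
root = tempfile.mkdtemp()

os.makedirs(os.path.join(root, "docs"))           # md: create a folder
src = os.path.join(root, "a.txt")
with open(src, "w") as f:
    f.write("hello")

shutil.copy(src, os.path.join(root, "b.txt"))     # copy: duplicate a file
os.rename(os.path.join(root, "b.txt"),
          os.path.join(root, "c.txt"))            # rename / move: new name or location
print(sorted(os.listdir(root)))                   # dir: list directory contents
os.remove(os.path.join(root, "c.txt"))            # del: delete a file
shutil.copytree(root, root + "_mirror")           # robocopy-style recursive directory copy
```

Note that `shutil.copytree` only approximates robocopy's happy path; the real tool adds retries, mirroring of deletions, and background operation that the standard library does not provide.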


    1. For example:

      One example of assistive technology for people with disabilities is screen readers. These are software applications designed to help individuals who are blind or visually impaired interact with digital content. Screen readers convert text on a screen into synthesized speech or braille output, allowing users to navigate websites, applications, and documents independently. Popular screen readers include JAWS (Job Access With Speech) and NVDA (NonVisual Desktop Access) for Windows, and VoiceOver on Apple devices.

    1. Services: Trash & Recycling

      Clarke County Convenience Center, located at 90 Quarry Rd. (Rt. 612) in the northeastern part of the county, is county operated for Clarke residents only. This facility is not for commercial use. The center accepts bagged household trash (10 bags maximum) and un-bagged recyclables. (See details below.) An attendant is always on site to assist residents, maintain the site, and ensure residents comply with posted policies. Find more details about local trash collection as well as where to dispose of hazardous materials, appliances, yard waste (including Christmas trees), etc. using the links at left.

      The Quarry Road facility is open:
      • 3 to 7 p.m. Friday
      • 7 a.m. to 6 p.m. Saturday
      • 10 a.m. to 3 p.m. Sunday
      • 9 a.m. to 3 p.m. Monday

      Hours may change because of weather or other conditions. If use greatly increases, Clarke County may revise the schedule and open on other weekdays.

      The Convenience Center is closed:
      • New Year's Day
      • Easter Sunday
      • Memorial Day
      • Independence Day
      • Labor Day
      • Thanksgiving Day
      • Christmas Day

      Clarke County residents may use any of these six trash facilities:
      • 90 Quarry Rd., Berryville (operated by Clarke County)
      • 280 Landfill Rd., Winchester (operated by Frederick County)
      • 4201 Stonewall Jackson Hwy., White Post (operated by Frederick County)
      • 235 Hot Run Dr., Stephenson (operated by Frederick County)
      • 801 Greenwood Rd., Winchester (operated by Frederick County)
      • 47 Blue Mountain Rd., Front Royal (operated by Warren County)

      Clarke County Convenience Center has separate recycling containers for paper, cardboard, aluminum and steel cans, clean glass bottles and jars (with corks, caps, and lids removed), and plastic (#1 and #2). The facility does not accept plastics #3 through #7. The Convenience Center accepts clean glass bottles and jars for recycling. Residents must remove all corks, caps, and lids before placing glass in the container. Do NOT put mirrors, windows, heat-tempered glass such as Pyrex and mixing bowls, ceramic mugs and plates, wine glasses, or any trash (including plastic bags) in the recycling container. For glass recycling to continue in Clarke County, glass bottles and jars must be clean. Do not put plastic bags of any kind in any of the recycling containers. The Quarry Road facility does not accept yard waste, appliances, furniture, or hazardous materials of any kind. See the "Yard Waste, Appliances & Hazardous Materials" link at left.

      Dumping trash of any kind on the ground or around the Clarke County Convenience Center property is prohibited and violators will be prosecuted. Illegal dumping constitutes a Class 1 misdemeanor punishable by a fine up to $2,500 and/or up to one year of imprisonment. The Clarke County Convenience Center site is under video surveillance 24/7. The Town of Berryville provides trash pickup and recycling for residents and businesses within its town limits. The Town of Boyce provides trash pickup for its residents.
    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

Petty and Bruno investigate how response characteristics in the higher-order thalamic nuclei POm (typically somatosensory) and LP (typically visual) change when a stimulus (whisker air puff or visual drifting grating) of one or the other modality is conditioned to a reward. Using a two-step training procedure, they developed an elegant paradigm, where the distractor stimulus is completely uninformative about the reward, which is reflected in the licking behavior of trained mice. While the animals seem to take to the tactile stimulus more readily, they can also associate the reward with the visual stimulus, ignoring tactile stimuli. In trained mice, the authors recorded single-unit responses in both POm and LP while presenting the same stimuli. The authors first focused on POm recordings, finding that in animals with tactile conditioning POm units specifically responded to the air puff stimulus but not the visual grating. Unexpectedly, in visually conditioned animals, POm units also responded to the visual grating, suggesting that the responses are not modality-specific but more related to behavioral relevance. These effects seem not to be homogeneously distributed across POm: lateral units maintain tactile specificity, whereas medial units respond more flexibly. The authors further ask if the unexpected cross-modal responses might result from behavioral activity signatures. By regressing behavior-coupled activity out of the responses, they show that late activity indeed can be related to whisking, licking, and pupil size measures. However, cross-modal short-latency responses are not clearly related to animal behavior. Finally, LP neurons also seem to change their modality specificity depending on conditioning, with tactile responses attenuated in LP if the animal is conditioned to visual stimuli.

      The authors make a compelling case that POm neurons are less modality-specific than typically assumed. The training paradigm, employed methods, and analyses are mostly to the point, well supporting the conclusions. The findings importantly widen our understanding of higher-order thalamus processing features with the flexibility to encode multiple modalities and behavioral relevance. The results raise many important questions on the brain-wide representation of conditioned stimuli. E.g. how specific are the responses to the conditioned stimuli? Are thalamic cross-modal neurons recruited for the specific conditioned stimulus or do their responses reflect a more global shift of attention from one modality to another? 

To elaborate on higher-order thalamic activity in relationship to conditioned behavior, a trial-by-trial analysis would be very useful. Is neuronal activity predictive of licking, and at which relative timing? 

      To elaborate on the relationship between neuronal activity and licking, we have created a new supplementary figure (Figure S1), where we present the lick latency of each mouse on the day of recording. We also perform more in-depth analysis of neural activity that occurs before lick onset, which is presented in a new main figure (new Figure 4). 

      Furthermore, I wonder why the (in my mind) major and from the data obvious take-away, "POm neurons respond more strongly to visual stimuli if visually conditioned", is not directly tested in the summary statistics in Figure 3h.

      We have added a summary statistic to Figure 3h and to the Results section (lines 156-157) comparing the drifting grating responses in visually and tactilely conditioned mice.  

      The remaining early visual responses in POm in visually conditioned mice after removing behavior-linked activity are very convincing (Figure 5d). It would help, however, to see a representation of this on a single-neuron basis side-by-side. Are individual neurons just coupled to behavior while others are independent, or is behaviorally coupled activity a homogeneous effect on all neurons on top of sensory activity?

      In lieu of a new figure, we have performed a new analysis of individual neurons to classify them as “stimulus tuned” and/or “movement tuned.” We find that nearly all POm cells encode movement and arousal regardless of whether they also respond to stimuli. This is presented in the Results under the heading “POm correlates with arousal and movement regardless of conditioning” (Lines 219-231).

      The conclusions on flexible response characteristics in LP in general are less strongly supported than those in POm. First, the differentiation between POm and LP relies heavily on the histological alignment of labeled probe depth and recording channel, possibly allowing for wrong assignment. 

We appreciate the importance of differentiating between POm, LP, and surrounding regions to accurately assign a putative cell to a brain region. The method we employed (aligning an electrode track to a common reference atlas) is widely used in rodent neuroscience, especially in regions like POm and LP, which are difficult to differentiate molecularly (for example, see Sibille, Nature Communications, 2022; and Schröder, Neuron, 2020). 

      Furthermore, it seems surprising, but is not discussed, that putative LP neurons have such strong responses to the air puff stimuli, in both conditioning cases. In tactile conditioning, LP air puff responses seem to be even faster and stronger than POm. In visual conditioning, drifting grating responses paradoxically seem to be later than in tactile conditioning (Fig S2e). These differences in response changes between POm and LP should be discussed in more detail and statements of "similar phenomena" in POm and LP (abstract) should be qualified.  

      We have further developed our analysis and discussion of LP activity. Our analysis of LP stimulus response latencies are now presented in greater detail in Figure S3, and we have expanded the results section accordingly (lines 266-275). We have also expanded the discussion section to both address these new analyses and speculate on what might drive these surprising “tactile responses” in LP.

      Reviewer #2 (Public Review): 

      Summary  

This manuscript by Petty and Bruno delves into the still poorly understood role of higher-order thalamic nuclei in the encoding of sensory information by examining the activity of POm and LP cells in mice performing an associative learning task. They developed an elegant paradigm in which they conditioned head-fixed mice to attend to a stimulus of one sensory modality (visual or tactile) and ignore a second stimulus of the other modality. They recorded simultaneously from POm and LP, using 64-channel electrode arrays, to reveal the context-dependency of the firing activity of cells in higher-order thalamic nuclei. They concluded that behavioral training reshapes activity in these secondary thalamic nuclei. I have no major concerns with the manuscript's conclusions, but some important methodological details are lacking and I feel the manuscript could be improved with the following revisions.

      Strengths 

The authors developed an original and elegant paradigm in which they conditioned head-fixed mice to attend to a stimulus of one sensory modality, either visual or tactile, and ignore a second stimulus of the other modality. As a tactile stimulus, they applied gentle air puffs on the distal part of the vibrissae, ensuring that the stimulus was innocuous and therefore non-aversive, which is crucial in their study. 

      It is commonly viewed that the first-order thalamus performs filtering and re-encoding of the sensory flow; in contrast, the computations taking place in high-order nuclei are poorly understood. They may contribute to cognitive functions. By integrating top-down control, high-order nuclei may participate in generating updated models of the environment based on sensory activity; how this can take place is a key question that Petty and Bruno addressed in the present study.

      Weaknesses  

(1) Overall, the methods, results, and discussion involving sensory responses, especially for POm, are confusing. I have the feeling that throughout the manuscript, the authors are dealing with the sensory and non-sensory aspects of the modulation of the firing activity in POm and LP without a clear definition of what they examined. Making subsections in the results, or better naming of what is analyzed, could convey the authors' message in a clearer way, e.g., baseline, stim-on, reward.  

      We thank Reviewer 2 for this suggestion. We have adjusted the language throughout the paper to more clearly state which portions of a given trial we analyzed. We now consistently refer to “baseline,” “stimulus onset,” and “stimulus offset” periods. 

      In line #502 in Methods, the authors defined "Sensory Responses. We examined each cell's putative sensory response by comparing its firing rate during a "stimulus period" to its baseline firing rate. We first excluded overlapping stimuli, defined as any stimulus occurring within 6 seconds of a stimulus of a different type. We then counted the number of spikes that occurred within 1 second prior to the onset of each stimulus (baseline period) and within one second of the stimulus onset (stimulus period). The period within +/-50ms of the stimulus was considered ambiguous and excluded from analysis." 

Considering that the responses to whisker deflection, while weak and delayed, were shown to occur, when present, before 50 ms in POm (Diamond et al., 1992), it is not clear what the authors mean and consider as "Sensory Responses"? 

We have addressed this important concern in three ways. First, we have reanalyzed our data to include the 50ms pre- and post-stimulus time windows that were previously excluded. This did not qualitatively change our results, but updated statistical measurements are reflected in the Results and the legends of figures 3 and 7. Second, we have created a new figure (new Figure 4) which provides a more detailed analysis of early POm stimulus responses at a finer time scale. Third, we have amended the language throughout the paper to refer to "stimulus responses" rather than "sensory responses" to reflect how we cannot disambiguate between bottom-up sensory input and top-down input into POm and LP with our experimental setup. We refer only to "putative sensory responses" when discussing low-latency (<100ms) stimulus responses.
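The baseline-versus-stimulus comparison described in the quoted Methods text can be sketched roughly as follows. This is a minimal illustration with synthetic spike times, not the authors' actual pipeline; the function name and the use of a paired Wilcoxon signed-rank test across trials are assumptions.

```python
import numpy as np
from scipy import stats

def stimulus_response(spike_times, stim_onsets, baseline=1.0, window=1.0):
    """Compare per-trial spike counts in a post-stimulus window against an
    equal-length pre-stimulus baseline with a paired signed-rank test."""
    spike_times = np.asarray(spike_times)
    base = np.array([np.sum((spike_times >= t - baseline) & (spike_times < t))
                     for t in stim_onsets])
    resp = np.array([np.sum((spike_times >= t) & (spike_times < t + window))
                     for t in stim_onsets])
    _, p = stats.wilcoxon(base, resp)
    return base.mean(), resp.mean(), p

# Synthetic unit: 1 baseline spike and 3 evoked spikes around each of 10 stimuli.
onsets = np.arange(10.0, 110.0, 10.0)
spikes = np.concatenate([[t - 0.5, t + 0.1, t + 0.2, t + 0.3] for t in onsets])
base_rate, resp_rate, p = stimulus_response(spikes, onsets)
```

A unit would then count as stimulus-responsive when p falls below the chosen threshold; the ±50 ms exclusion discussed in the Methods quote would be an extra masking step on `spike_times`.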

Precise wording may help to clarify the message. For instance, in line #134: "Of cells from tactilely conditioned mice, 175 (50.4%) significantly responded to the air puff, as defined by having a firing rate significantly different from baseline within one second from air puff onset (Figure 3d, bottom)", the phrase "significantly responded to the air puff" should be written "significantly increased (or modified, if some decreased) their firing rate within one second after the air puff onset (baseline: ...)". This will avoid any confusion with the sensory responses per se.

      We have made this specific change suggested by the reviewer (lines 145-146) and made similar adjustments to the language throughout the manuscript to better communicate our analysis methods. 

(2) To extend the previous concern, the latency of the modulation of the firing rate of POm cells for each modality and each conditioning may be an issue. This latency, given in Figure S2, is rather long, i.e., particularly late latencies for the whisker system, which is completely in favor of non-sensory "responses" per se and the authors' hypothesis that sensory-, arousal-, and movement-evoked activity in POm is shaped by associative learning. Latency is a key point in this study. 

      Therefore, 

      - latencies should be given in the main text, and Figure S2 could be considered for a main figure, at least panels c, d, and e, could be part of Figure 3. 

- the Figure S2b points out rather short-latency responses to the air puff, at least in some cells, in addition to late ones. The manuscript would highly benefit from an analysis of both early and late latency components of the "responses" to air puffs and drifting gratings in both conditions. This analysis may definitely help to clarify the authors' message. Since the authors performed unit recordings, these data are accessible.

- it would be highly instructive to examine the latency of the modulation of POm cells' firing rate in parallel with the onset of each behavior, i.e., modification of pupil radius, whisking amplitude, and lick rate (Figures 1e, g and 3a, b). Figure 1 does not provide the latency of the licks in conditioned mice.

      - the authors mention in the discussion low-latency responses, e.g., line #299: "In both tactilely and visually conditioned mice, movement could not explain the increased firing rate at air puff onset. These low-latency responses across conditioning groups is likely due in part to "true" sensory responses driven by S1 and SpVi."; line #306: "Like POm, LP displayed varied stimulus-evoked activity that was heavily dependent on conditioning. LP responded to the air puff robustly and with low latency, despite lacking direct somatosensory inputs."  But which low-latency responses do the authors refer to? Again, this points out that a robust analysis of these latencies is missing in the manuscript but would be helpful to conclude.

      We have moved our analysis of stimulus response latency in POm to new Figure 4 in the main text and have expanded both the Results and Discussion sections accordingly. We have also analyzed the lick latency on the day of recording, included in a new supplemental Figure S1. 

      (3) Anatomical locations of recordings in the dorsal part of the thalamus. Line #122 "Our recordings covered most of the volume of POm but were clustered primarily in the anterior and medial portions of LP (Figure 2d-f). Cells that were within 50 µm of a region border were excluded from analysis." 

      How did the authors distinguish the anterior boundary of the LP with the LD nucleus just more anterior to the LP, another higher-order nucleus, where whisker-responsive cells have been isolated (Bezdudnaya and Keller, 2008)? 

      Cells within 50µm of any region boundary were excluded, including those at the border of LP and LD. We also reviewed our histology images by eye and believe that our recordings were all made posterior of LD. 

      (4) The mention in the Methods about the approval by an ethics committee is missing.  All the surgery (line #381), i.e., for the implant, the craniotomy, as well as the perfusion, are performed under isoflurane. But isoflurane induces narcosis only and not proper anesthesia. The mention of the use of analgesia is missing. 

      We thank Reviewer 2 for drawing our attention to this oversight. All experiments were conducted under the approval of the Columbia University IACUC. Mice were treated with the global analgesics buprenorphine and carprofen, the local analgesic bupivacaine, and anesthetized with isoflurane during all surgical procedures. We have amended the Methods section to include this information (Lines 458-470).

      Reviewer #3 (Public Review): 

      Petty and Bruno ask whether activity in secondary thalamic nuclei depends on the behavioral relevance of stimulus modality. They recorded from POm and LP, but the weight of the paper is skewed toward POm. They use two cohorts of mice (N=11 and 12), recorded in both nuclei using multi-electrode arrays, while being trained to lick to either a tactile stimulus (air puff against whiskers, first cohort) or a visual stimulus (drifting grating, second cohort), and ignore the respective other. They find that both nuclei, while primarily responsive to their 'home' modality, are more responsive to the relevant modality (i.e. the modality predicting reward). 

      Strengths: 

      The paper asks an important question, it is timely and is very well executed. The behavioral method using a delayed lick index (excluding impulsive responses) is well worked out. Electrophysiology methods are state-of-the-art with information about spike quality in Figure S1. The main result is novel and important, convincingly conveying the point that encoding of secondary thalamic nuclei is flexible and clearly includes aspects of the behavioral relevance of a stimulus. The paper explores the mapping of responses within POm, pointing to a complex functional structure, something that has been reported/suggested in earlier studies. 

      Weaknesses: 

Coding: It does not become clear to which aspect of the task POm/LP is responding. There is a motor-related response (whisking, licking, pupil); however, after regressing it out, a response remains that the authors speculate could be sensory.

      Learning: The paper talks a lot about 'learning', although it is only indirectly addressed. The authors use two differently (over-)trained mice cohorts rather than studying e.g. a rule switch in one and the same mouse, which would allow us to directly assess whether it is the same neurons that undergo rule-dependent encoding. 

      We disagree that our animals are “overtrained,” as every mouse was fully trained within 13 days. We agree that it would be interesting to study a rule-switch type experiment, but such an experiment is not necessary to reveal the profound effect that conditioning has on stimulus responses in POm and LP. 

      Mapping: The authors treat and interpret the two nuclei very much in the same vein, although there are clear differences. I would think these differences are mentioned in passing but could be discussed in more depth. Mapping using responses on electrode tracks is done in POm but not LP.

      The mapping of LP responses by anatomical location is presented in the supplemental Figure S4 (previously S3). We have expanded our discussion of LP and how it might differ from POm.

      Reviewer #1 (Recommendations For The Authors):  

      Minor writing issues: 

      122 ...67 >LP< cells?

      301 plural "are”

      We have fixed these typos.

      Figure issues

      *  3a,b time ticks are misaligned and the grey bar (bottom) seems not to align with the visual/tactile stimulus shadings.

      *  legend to Figure 3b refers to Figure 1c which is a scheme, but if 1g is meant, this mouse does not seem to have a session 12? 

      *  3c,e time ticks slightly misaligned. 

      *  5e misses shading for the relevant box plots, assuming it should be like Figure 3h.  

      We thank Reviewer 1 for pointing out these errors. We have adjusted Figures 1, 3, and 5 accordingly.

      Analyses 

I am missing summary statistics for LP similar to those in Figure 3h.

      We have added a summary box chart of LP stimulus responses (Figure 7g), similar to that of POm in Figure 3. We have also performed similar statistical analyses, the results of which are presented in the legend for Figure 7. 

      Reviewer #2 (Recommendations For The Authors): 

      More precisions are required for the following points: 

(1) The mention of the use of analgesia is missing and this is not a minor concern. Even if the recordings are performed 24 hours after the surgery for the craniotomy and screw insertion, and several days after the main surgery for the implant, taking into account the pain of the animals during surgeries is crucial, first for ethical reasons, and second because it may affect the data, especially in POm cells: pain during surgery may induce the development of allodynia and/or hyperalgesia phenomena, and POm responses to sensory stimuli were shown to be more robust in behavioral hyperalgesia (Masri et al., 2009).  

We neglected to include details on the analgesics used during surgery and post-operation recovery in our original manuscript. Mice were administered buprenorphine, carprofen, and bupivacaine immediately prior to the head plate surgery and were treated with additional carprofen during recovery. Mice were similarly treated with analgesics for the craniotomy procedure. Mice were carefully observed after craniotomy, and we saw no evidence of pain or discomfort. Furthermore, mice performed the behavior at the same level pre- and post-craniotomy (now presented in Figure 1j), which also indicates that they were not in any pain. 

      (2) The head-fixed preparation is only poorly described.

      Line #414: "Prior to conditioning, mice were habituated to head fixation and given ad libitum water in the behavior apparatus for 15-25 minutes." 

      And line #425 "Mice were trained for one session per day, with each session consisting of an equal number of visual stimuli and air puffs. Sessions ranged from 20-60 minutes and about 40-120 of each stimulus. " 

More details should be given about the head-fixation training protocol. Is 15-25 minutes the session duration, or is it 60 minutes or some other duration? How long does it take to get mice well trained to head fixation, and by which criteria?  

      Line #389: "Mice were then allowed to recover for 24 hours, after which the sealant was removed and recordings were performed. At the end of experiments,"

      The timeline is not clear: is there one day or several days of recordings? 

      We have expanded on our description of the head fixation protocol in the Methods. We describe in more detail how mice were habituated to head fixation, the timing of water restriction, and the start of conditioning/training (Habituation and Conditioning, lines 492-500).

      (4) Line #411: "Mice were deprived of water 3 days prior to the start of conditioning" followed by line #414 "Prior to conditioning, mice were habituated to head fixation and given ad libitum water in the behavior apparatus for 15-25 minutes".

      If I understood correctly, the mice were then not fully water-deprived for 3 days since they received water while head-fixed. This point may be clarified. 

      We addressed these concerns in the changes to the Methods section mentioned in the preceding point (3).

(5) Line #157: "Modality selectivity varies with anatomical location in POm" while the end of the previous paragraph is "This suggests that POm encoding of reward and/or licking is insensitive to task type, an observation we examine further below."

The authors then come to anatomical concerns before coming back to what POm may encode in the following section. This makes the story quite confusing and hard to follow, even though it is pretty interesting.  

      We have reordered our Figures and Results to improve the flow of the paper and remove this point of confusion. We now present results on the encoding of movement before analyzing the relationship between POm stimulus responses and anatomical location. What was old Figure 5 now precedes what was old Figure 4.

      (6) Licks Analysis. Line #99 "However, this mouse also learned that the air puff predicted a lack of reward in the shaping task, as evidenced by withholding licking upon the onset of the air puff. The mouse thus displayed a positive visual lick index and a negative tactile lick index, suggesting that it attended to both the tactile and visual stimuli (Figure 1f, middle arrow)."

      Line #105 "All visually conditioned mice exhibited a similar learning trajectory (Figure 1i left, 1j left)". 

Interestingly, the authors revealed that mice withheld licking upon the onset of the air puff in the visual conditioning, which they did not do at the onset of the drifting grating in the tactile conditioning. This withholding was extinguished after the 8th session, which the authors interpret as the mice finally ignoring the air puff. Is this effect significant, i.e., is there significant withholding of licking upon the onset of the air puff across the 12 tested mice? 

      The withholding of licking was significant (assessed with a sign-rank test) in visually conditioned mice prior to switching to the full version of the task. Indeed, it was the abolishment of this effect after conditioning with the full version of the task that was our criterion for when a mouse was fully trained. We have elaborated on this in the Habituation and Conditioning section in the Methods.
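The sign-rank criterion can be illustrated with hypothetical per-mouse lick rates. The numbers below are made up for demonstration, not the study's data; `wilcoxon` is scipy's signed-rank test.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical mean lick rates (licks/s) for 12 mice in the shaping task:
# baseline versus during the air puff. Withholding predicts lower rates
# during the puff than at baseline.
baseline_rate = np.array([2.1, 1.8, 2.5, 2.0, 1.9, 2.3,
                          2.2, 1.7, 2.4, 2.0, 1.8, 2.6])
puff_rate = np.array([0.9, 1.0, 1.2, 0.8, 1.1, 1.0,
                      1.3, 0.7, 1.1, 0.9, 1.0, 1.2])

# Paired signed-rank test across mice, plus a direction check.
_, p = wilcoxon(puff_rate, baseline_rate)
withholds = bool(p < 0.05 and puff_rate.mean() < baseline_rate.mean())
```

Under the training criterion described in the response, this significant withholding would be expected to disappear once the mouse is fully trained on the full task.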

      (1) Throughout the manuscript "Touch" is used instead of passive whisker deflection, and may be confusing with "active touch" for the whisker community readers. I recommend avoiding using "touch" instead of "passive whisker deflection".

      We appreciate that “touch” can be an ambiguous term in some contexts. However, we have limited our use of the word to refer to the percept of whisker deflection; we do not describe the air puff stimulus as a “touch.” We respectfully would like to retain the use of the word, as it is useful for comparing somatosensory stimuli to visual stimuli.

      (2) Line #395: "Air puffs (0.5-1 PSI) were delivered through a nozzle (cut p1000 pipet tip, approximately 3.5mm diameter aperture)".

      Are air puffs of <1 PSI applied, not <1 bar?  

      We thank Reviewer 3 for pointing out this inaccuracy. The air puffs were indeed between 0.5 and 1 bar, not PSI. We have addressed this in the Methods.

      (3) Line #441: "In the full task, the stimuli and reward were identical, but stimuli were presented at uncorrelated and less predictable intervals."  Do the authors mean that all stimuli are rewarded?  

      The stimuli and reward were identical between the shaping and full versions of the task. In the full version of the task, the unrewarded stimulus was truly uncorrelated with reward, rather than anticorrelated. 

      (4) Line #445 "for a mean ISI of 20 msec." ISI is not defined, I guess that it means interstimulus interval. Even if pretty obvious, to avoid any confusion for future readers, I would recommend using another acronym, especially in a manuscript about electrophysiology, since ISI is a dedicated acronym for inter-spike interval. 

      We have defined the acronym ISI as “inter-stimulus interval” when first introduced in the results (Line 82) and in the Methods (Line 511).

      (5) Line #416 "In the first phase of conditioning ("shaping"), mice were separated into two cohorts: a "tactile" cohort and a "visual" cohort. Mice were presented with tactile stimuli (a two-second air puff delivered to the distal whisker field) and visual stimuli (vertical drifting grating on a monitor). Throughout conditioning, mice were monitored via webcam to ensure that the air puff only contacted the whiskers and did not disturb the facial fur nor cause the mouse to blink, flinch, or otherwise react - ensuring the stimulus was innocuous. The stimulus types were randomly ordered. In the visual conditioning cohort, the visual stimulus was paired with a water reward (8-16µL) delivered at the time of stimulus offset. In the tactile conditioning cohort, the reward was instead paired with the offset of the air puff. Regardless of the type of conditioning, stimulus type was a balanced 50:50 with an inter-stimulus interval of 8-12 seconds (uniform distribution)." 

      The mention of the "full version of the task" will be welcome in this paragraph to clarify what the task is for the mouse in the Methods part.

      We have more clearly defined the full version of the task in a later paragraph (line 506). We believe this addresses the potential confusion caused by the original description of the conditioning paradigm. 

      (6) Line #467: "Units were assigned to the array channel on which its mean waveform was largest". 

      Should it read mean waveform "amplitude"? 

This is correct; we have adjusted the statement accordingly. 

(7) Line #482 "The eye camera was positioned on the right side of the face and recorded at 60 fps." Then line #487 "The trace of pupil radius over time was smoothed over 5 frames (8.3 msec)." At 60 fps, 5 frames represent 83 ms, not 8.3 ms.

      We have corrected this error.  
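For reference, the corrected frames-to-milliseconds arithmetic:

```python
# At 60 fps each frame lasts 1000/60 ms, so a 5-frame smoothing window
# spans roughly 83 ms rather than 8.3 ms.
fps = 60
frames = 5
window_ms = frames * 1000 / fps
print(round(window_ms, 1))  # 83.3
```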

      (8) Line #121: "257 POm cells and 67 cells from 12 visually conditioned mice" 

      67 LP cells, LP is missing 

      We have corrected this error. 

(9) Line #354: "A consistent result of attention studies in humans and nonhuman primates is the enhancement of cortical and thalamic sensory responses to an attended visual stimuli. Here, we show not just enhancement of sensory responses to stimuli within a single modality, but also across modalities. It is worth investigating further how secondary thalamus and high-order sensory cortex encode attention to stimuli outside of their respective modalities. Our surprising conclusion that the nuclei are equivalently activated by behaviorally relevant stimuli is nevertheless compatible with these previous studies."  Since higher-order thalamic nuclei are integrative centers of many cortical and subcortical inputs, they cannot be viewed simply as relay nuclei, and there is therefore no "surprising" conclusion in these results. Not surprising, but still an elegant demonstration of the context-dependent activity/responses of the POm/LP cells. 

      We disagree. Visual stimuli activating strong POm responses and tactile stimuli activating strong LP responses - however they do it - is a surprising result. We agree that higher-order thalamic nuclei are integrative centers, but exactly what they integrate and what the integrated output means is still poorly understood.

    1. Reviewer #3 (Public review):

      Summary:

      Loewinger et al. extend a previously described framework (Cui et al., 2021) to provide new methods for statistical analysis of fiber photometry data. The methodology combines functional regression with linear mixed models, allowing inference on complex study designs that are common in photometry studies. To demonstrate its utility, they reanalyze datasets from two recent fiber photometry studies into mesolimbic dopamine. Then, through simulation, they demonstrate the superiority of their approach compared to other common methods.

      Strengths:

      The statistical framework described provides a powerful way to analyze photometry data and potentially other similar signals. The provided package makes this methodology easy to implement and the extensively worked examples of reanalysis provide a useful guide to others on how to correctly specify models.

      Modeling the entire trial (function regression) removes the need to choose appropriate summary statistics, removing the opportunity to introduce bias, for example in searching for optimal windows in which to calculate the AUC. This is demonstrated in the re-analysis of Jeong et al., 2022, in which the AUC measures presented masked important details about how the photometry signal was changing. There is an appropriate level of discussion of the interpretation of the reanalyzed data that highlights the pitfalls of other methods and the usefulness of their methods.

      The authors' use of linear mixed methods, allows for the estimation of random effects, which are an important consideration given the repeated-measures design of most photometry studies.
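The pointwise idea behind this functional-regression approach can be sketched as fitting an ordinary regression at every time sample of the trial. This is a simplified sketch with synthetic data; it omits the smoothing across time and the random effects that the authors' mixed-model framework adds.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_time = 50, 40
x = rng.normal(size=n_trials)      # per-trial covariate (e.g., reward size)
t = np.linspace(0.0, 1.0, n_time)
true_beta = np.sin(2 * np.pi * t)  # the covariate's effect varies over the trial

# Synthetic photometry-like signal: covariate effect plus noise.
signal = x[:, None] * true_beta[None, :] \
    + rng.normal(scale=0.1, size=(n_trials, n_time))

# One least-squares fit per time point, done in a single lstsq call.
X = np.column_stack([np.ones(n_trials), x])
coef, *_ = np.linalg.lstsq(X, signal, rcond=None)  # shape (2, n_time)
slope_curve = coef[1]   # estimated covariate effect as a function of trial time
```

The recovered `slope_curve` tracks `true_beta` across the whole trial, which is exactly the structure that a single summary statistic such as an AUC would collapse.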

      The authors provide a useful guide for how to practically use and implement their methods in an easy-to-use package. These methods should have wide applicability to those who use photometry or similar methods. The development of this excellent open-source software is a great service to the wider neuroscience community.

    1. Author response:

      We thank the editor and reviewers for their feedback. We believe we can address the substantive criticisms in full, first, by providing a more explicit theoretical basis for the method. Then, we believe criticism based on assumptions about phase consistency across time points are not well founded and can be answered. Finally, in response to some reviewer comments, we will improve the surrogate testing of the method.

      We will enhance the theoretical justification for the application of higher-order singular value decomposition (SVD) to the problem of irregular sampling of the cortical area. The initial version of the manuscript was written to allow informal access to these ideas (if possible), but the reviewers find a more rigorous account appropriate. We will add an introduction to modern developments in the use of functional SVD in geophysics, meteorology and oceanography (e.g., empirical orthogonal functions), quantitative fluid dynamics (e.g., dynamic mode decomposition), and computational chemistry. Recently, SVD has been used in neuroscience studies (e.g., cortical eigenmodes). To our knowledge, our work is the first time higher-order SVD has been applied to a neuroscience problem. We use it here to solve an otherwise (apparently) intractable problem, i.e., how to estimate the spatial frequency (SF) spectrum on a sparse and highly irregular array with broadband signals.

      We will clarify the methodological strategy in more formal terms in the next version of the paper. But essentially SVD allows a change of basis that greatly simplifies quantitative analysis. Here it allows escape from estimating the SF across millions of data-points (triplets of contacts, at each sample), each of which contains multiple overlapping signals plus noise (noise here defined in the context of SF estimation) and is inter-correlated across a variety of known and unknown observational dimensions. Rather than simply averaging over samples, which would wash out much of the real signal, SVD allows the signals to be decomposed in a lossless manner (up to the choice of the number of eigenvectors at which the SVD is truncated). The higher-order SVD we have implemented reduces the size of the problem, allowing quantification of SF over hundreds of components, each of which is guaranteed certain desirable properties: they explain known (and largest) amounts of variance of the original data and are orthonormal. This last property allows us to proceed as if the observations are independent. SF estimates are made within this new coordinate system.
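      The change-of-basis step can be sketched with an ordinary (not higher-order) SVD. This is an illustration only, with hypothetical array sizes and simulated low-rank data rather than the sEEG recordings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: contacts x samples, with a low-rank signal plus noise.
n_contacts, n_samples = 40, 5000
latent = rng.normal(size=(n_contacts, 3)) @ rng.normal(size=(3, n_samples))
data = latent + 0.1 * rng.normal(size=(n_contacts, n_samples))

# SVD change of basis: decompose, then truncate to the k leading components.
U, s, Vt = np.linalg.svd(data, full_matrices=False)
k = 3
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k]

# The retained spatial components are orthonormal, so downstream estimates
# can treat them as independent observations, and each component's squared
# singular value states how much variance of the original data it explains.
gram = U_k.T @ U_k                            # ~ identity matrix
explained = (s_k ** 2).sum() / (s ** 2).sum()

# Truncation is near-lossless when k matches the effective rank of the data.
recon = (U_k * s_k) @ Vt_k
```

      The same logic carries over to the higher-order case, where the decomposition runs over additional observational dimensions instead of a single contacts-by-samples matrix.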

      We will also more concretely formalise the relation between Fourier analysis and previous observations of eigenvectors of phase that are smooth gradients.

      We will very briefly review Fourier methods designed to deal with non-uniform sampling. The problems these methods are designed for fall into the non-uniform part of the spectrum from uniform–non-uniform–irregular–highly-irregular–noise. They are highly suited to, for example, interpolating between EEG electrodes to produce a uniform array for application of the fast Fourier transform (Alamia et al., 2023). However, a survey across a range of applied maths fields suggests that no method exists for the degree of irregular sampling found in the sEEG arrays at issue here. In particular, the sparseness of the contact coverage presents an insurmountable hurdle to standard methods. While there exist methods for sparse samples (e.g., Margrave & Ferguson, 1999; Ying, 2009), these require well-defined oscillatory behavior, e.g., for seismographic analysis. Given the problems of highly irregular sampling, sparseness of sampling, and broadband, nonstationary signals, we have attempted a solution via the novel methods introduced in the current manuscript. We were able to leverage previous observations regarding the relation between eigenvectors of cortical phase and Fourier analysis, as we outline in the manuscript.

      We will extend the current 1-dimensional surrogate data to better demonstrate that the method does indeed correctly detect the ordinal relations in power on different parts of the SF spectrum. We will include the effects of a global reference signal. Simulations of cortical activity are an expensive way to achieve this goal. While the first author has published in this area, such simulations are partly a function of the assumptions put into them (i.e., spatial damping, boundary conditions, parameterization of connection fields). We will therefore use surrogate signals derived from real cortical activity to complete this task.

      Some more specific issues raised:

      (1) Application of the method to general neuroscience problems:

      The purpose of the manuscript was to estimate the SF spectrum of phase in the cortex, in the range where it was previously not possible. The purpose was not specifically to introduce a new method of analysis that might be immediately applicable to a wide range of available data-sets. Indeed, the specifics of the method are designed to overcome an otherwise intractable disadvantage of sEEG (irregular spatial sampling) in order to take advantage of its good coverage (compared to ECoG) and low volume conduction compared to extra-cranial methods. On the other hand, the developing field of functional SVD would be of interest to neuroscientists, as a set of methods to solve difficult problems, and therefore of general interest. We will make these points explicit in the next version of the manuscript. In order to make the method more accessible, we will also publish code for the key routines (construction of triplets of contacts, Morlet wavelets, calculation of higher-order SVD, calculation of SF).

      (2) Novelty:

      We agree with the third reviewer: if our results can convince, then the study will have an impact on the field. While there is work that has been done on phase interactions at a variety of scales, such as from the labs of Fries, Singer, Engels, Nauhaus, Logothetis and others, it does not quantify the relative power of the different spatial scales. Additionally, the research of Freeman et al. has quantified only portions of the SF spectrum of the cortex, or used EEG to estimate low SFs. We would appreciate any pointers to the specific literature the current research contributes to, namely, the SF spectrum of activity in the cortex.

      (3) Further analyses:

      The main results of the research are relatively simple: monotonically falling SF-power with SF; this effect occurs across the range of temporal frequencies. We provide each individual participant’s curves in the supplementary Figures. By visual inspection, it can be seen that the main result of the example participant is uniformly recapitulated. One is rarely in this position in neuroscience research, and we will make this explicit in the text.

      The research stands or falls by the adequacy of the method to estimate the SF curves. For this reason most statistical analyses and figures were reserved for ruling out confounds and exploring the limits of the methods. However, for the sake of completeness, we will now include the SF vs. SF-power correlations and significance in the next version, for each participant at each frequency.

      Since the main result was uniform across participants, and since we did not expect that there was anything of special significance about the delayed free recall task, we conclude that more participants or more tasks would not add to the result. As we point out in the manuscript, each participant is a test of the main hypothesis. The result is also consistent with previous attempts to quantify the SF spectrum, using a range of different tasks and measurement modalities (Barrie et al., 1996; Ramon & Holmes, 2015; Alexander et al., 2019; Alexander et al., 2016; Freeman et al., 2003; Freeman et al., 2000). The search for those rare sEEG participants with larger coverage than the maximum here is a matter of interest to us, but will be left for a future study.

      (4) Sampling of phase and its meaningfulness:

      The wavelet methods used in the present study have excellent temporal resolution but poor frequency resolution. We additionally oversample the frequency range to produce visually informative plots (usually in the context of time-by-frequency plots, see Alexander et al., 2006; 2013; 2019). But it is not correct that the methods for estimating phase assume a narrow frequency band. Rather, the poor frequency resolution of short time-series Morlet wavelets means the methods are robust to the exact shape of the waveforms; the signal need only be approximately sinusoidal, i.e., rise and fall. The reason for using methods that have excellent resolution in the time-domain is that previous work (Alexander et al., 2006; Patten et al., 2012) has shown that traveling wave events can last only one or two cycles, i.e., are not oscillatory in the strict sense but are non-stationary events. So while short time-window Morlet wavelets have a disadvantage in terms of frequency resolution, they precisely do not have the problem of assuming narrow-band sinusoidal waveforms in the signal. We strongly disagree that our analysis requires very strong assumptions about oscillations (see last point in this section).
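      As a concrete illustration of this trade-off, a short complex Morlet wavelet can be built and applied in a few lines of NumPy. This is a generic sketch with a simulated two-cycle 10 Hz event and a hypothetical sampling rate, not the authors' code: a few-cycle wavelet localizes the event well in time while remaining deliberately broad in frequency.

```python
import numpy as np

fs = 1000.0                        # Hz, hypothetical sampling rate
t = np.arange(0, 2.0, 1 / fs)

# A brief, non-stationary "wave event": ~10 Hz activity lasting two cycles.
event = np.where((t > 0.9) & (t < 1.1), np.sin(2 * np.pi * 10.0 * t), 0.0)
signal = event + 0.05 * np.random.default_rng(2).normal(size=t.size)

# Short Morlet wavelet at 10 Hz: a complex sinusoid under a Gaussian
# envelope. Few cycles -> good temporal resolution, broad frequency response.
n_cycles = 3
f0 = 10.0
sigma_t = n_cycles / (2 * np.pi * f0)
wt = np.arange(-3 * sigma_t, 3 * sigma_t, 1 / fs)
wavelet = np.exp(2j * np.pi * f0 * wt) * np.exp(-wt ** 2 / (2 * sigma_t ** 2))
wavelet /= np.abs(wavelet).sum()

# Complex convolution gives amplitude and instantaneous phase at each sample.
analytic = np.convolve(signal, wavelet, mode="same")
amplitude = np.abs(analytic)
phase = np.angle(analytic)
```

      The amplitude envelope peaks at the event despite its lasting only two cycles, and the phase estimate does not require the waveform to be a sustained, narrow-band oscillation.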

      Our hypothesis was about the SF spectrum of the phase. When the measurement of phase is noise-like at some location, frequency and time, then this noise will not substantially contribute to the low SF parts of the spectrum compared to high SFs. Our hypothesis also concerned whether it was reasonable to interpret the existing literature on low SF waves in terms of cortically localised waves or small numbers of localised oscillators. This required us to show that low SFs dominate, and therefore that this signal must dominate any extra-cranial measurements of apparent low SF traveling waves. It does not require us to demonstrate that the various parts of the SF spectrum are meaningful in the sense of functionally significant. This has been shown elsewhere (see references to traveling waves in manuscript, to which we will also add a brief survey of research on phase dynamics).

      The calculation of phase can be bypassed altogether to achieve the initial effect described in the introduction to the methods (Fourier-like basis functions from SVD). The observed eigenvectors, increasing in spatial frequency with decreasing eigenvalues, can be reproduced by applying Gaussian windows to the raw time-series (D. Alexander, unpublished observation). For example, undertaking an SVD on the raw time-series windowed over 100ms reproduces much the same spatial eigenvectors (except that they come in pairs, recapitulating the real and imaginary parts of the signal). This reproducibility is in comparison to first estimating the phase at 10Hz using Morlet wavelets, then applying the SVD to the unit-length complex phase values.

      (5) Other issues to be addressed and improved:

      - clarity on which experiments were analyzed (starting in the abstract)
      - discussion of frequencies above 60 Hz, and caution in interpretation due to spike-waveform artefact or as a potential index of multi-unit spiking
      - discussion of whether the ad hoc, quasi-random sampling achieved by sEEG contacts somehow inflates the low SF estimates

      References (new)

      Patten TM, Rennie CJ, Robinson PA, Gong P (2012) Human Cortical Traveling Waves: Dynamical Properties and Correlations with Responses. PLoS ONE 7(6): e38392. https://doi.org/10.1371/journal.pone.0038392

      Margrave GF, Ferguson RJ (1999) Wavefield extrapolation by nonstationary phase shift. GEOPHYSICS 64:4, 1067-1078

      Ying Y (2009) Sparse Fourier Transform via Butterfly Algorithm. SIAM Journal on Scientific Computing 31:3, 1678-1694

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study investigates a dietary intervention that employs a smartphone app to promote meal regularity, which may be useful. Despite no observed changes in caloric intake, the authors report significant weight loss. While the concept is very interesting and deserves to be studied due to its potential clinical relevance, the study's rigor needs to be revised, notably for its reliance on self-reported food intake, a highly unreliable way to assess food intake. Additionally, the study theorizes that the intervention resets the circadian clock, but the study needs more reliable methods for assessing circadian rhythms, such as actigraphy.

      Thank you for the positive yet critical feedback on our manuscript. We are pleased with the assessment that our study is very interesting and deserves to be continued. We have addressed the points of criticism mentioned and discussed the limitations of the study in more detail in the revised version than before.

      Nevertheless, we would like to note that one condition for our study design was that the participants were able to carry out the study in their normal everyday environment. This means that it is not possible to fully objectively record food intake, especially not over a period of eight weeks. In our view, self-reporting of food intake is therefore unavoidable and also forms the basis of comparable studies on chrononutrition. We believe that recording data with a smartphone application at the moment of eating is a reliable means of recording food consumption and is better suited than questionnaires, for example, which have to be completed retrospectively. Objectivity could be optimized by transferring photographs of the food consumed. However, even this only provides limited protection against underreporting, as photos of individual meals, snacks, or second servings could be omitted by the participants. Sporadic indirect calorimetric measurements can help to identify under-reporting, but this cannot replace real-time self-reporting via smartphone application.

      Our data show that at the behavioral level, the rhythms of food intake are significantly less variable during the intervention. Our assumption that precise mealtimes influence the circadian rhythms of the digestive system is not new and has been confirmed many times in animal and human studies. It can therefore be assumed that comparable effects also apply to the participants in our study. Of course, a measurement of physiological rhythms is also desirable for a continuation of the study. However, we suspect that cellular rhythms in tissues of the digestive tract in particular are decisive for the changes in body weight. The characterization of these rhythms in humans is at best indirectly possible via blood factors. Reduced variability of the sleep-wake rhythm, which is measured by actigraphy, may result from our intervention, but in our view is not the decisive factor for the optimization of metabolic processes.

      We have addressed the specific comments and made changes to the manuscript as indicated below.

      Reviewer #1 (Public Review):

      The authors Wilming and colleagues set out to determine the impact of regularity of feeding per se on the efficiency of weight loss. The idea was to determine if individuals who consume 2-3 meals within individualized time frames, as opposed to those who exhibit stochastic feeding patterns throughout the circadian period, will cause weight loss.

      The methods are rigorous, and the research is conducted using a two-group, single-center, randomized-controlled, single-blinded study design. The participants were aged between 18 and 65 years old, and a smartphone application was used to determine preferred feeding times, which were then used as defined feeding times for the experimental group. This adds strength to the study since restricting feeding within preferred/personalized feeding windows will improve compliance and study completion. Following a 14-day exploration phase and a 6-week intervention period in a cohort of 100 participants (inclusive of both the controls and the experimental group that completed the study), the authors conclude that when meals are restricted to 45min or less durations (MTVS of 3 or less), this leads to efficient weight loss. Surprisingly, the study excludes the impact of self-reported meal composition on the efficiency of weight loss in the experimental group. In light of this, it is important to follow up on this observation and develop rigorous study designs that will comprehensively assess the impact of changes (sustained) in dietary composition on weight loss. The study also reports interesting effects of regularity of feeding on eating behavior, which appears to be independent of weight loss. Perhaps the most important observation is that personalized interventions that cater to individual circadian needs will likely result in more significant weight loss than when interventions are mismatched with personal circadian structures.

      We would like to thank the reviewer for the positive assessment of our study.

      (1) One concern for the study is its two-group design; however, single-group cross-over designs are tedious to develop, and an adequate 'wash-out' period may be difficult to predict.

      A cross-over design would of course be highly desirable and, if feasible, would be able to provide more robust data than a two-group design. However, we have strong doubts about the feasibility of a cross-over design. Not only does the determination of the length of the washout period to avoid carry-over effects of metabolic changes pose a difficulty, but also the assumption that those participants who start with the TTE intervention will consciously or unconsciously pay attention to adherence to certain eating times in the next phase, when they are asked to eat at times like before the study.

      In a certain way, however, our study fulfills at least one arm of the cross-over design. During the follow-up period of our study, there were some participants who, by their own admission, started eating at more irregular times again, which is comparable to the mock treatment of the control subjects. And these participants gained weight again.

      (2)  A second weakness is not considering the different biological variables and racial and ethnic diversity and how that might impact outcomes. In sum, the authors have achieved the aims of the study, which will likely help move the field forward.

      In the meantime, we have at least added analyses regarding the age and gender of the participants and found no correlations with weight loss. The sample size of this pilot study was too small for a reliable analysis of the influence of ethnic diversity. If the study is continued with a larger sample size, this type of analysis will certainly come into play.

      We are pleased with the assessment that we have achieved our goals and are helping to advance the field.

      Reviewer #2 (Public Review):

      Summary:

      The authors investigated the effects of the timing of dietary occasions on weight loss and well-being with the aim of explaining if a consistent, timely alignment of dietary occasions throughout the days of the week could improve weight management and overall well-being. The authors attributed these outcomes to a timely alignment of dietary occasions with the body's own circadian rhythms. However, the only evidence the authors provided for this hypothesis is the assumption that the individual timing of dietary occasions of the study participants identified before the intervention reflects the body's own circadian rhythms. This concept is rooted in understanding of dietary cues as a zeitgeber for the circadian system, potentially leading to more efficient energy use and weight management. Furthermore, the primary outcome, body weight loss, was self-reported by the study participants.

      Strengths:

      The innovative focus of the study on the timing of dietary occasions rather than daily energy intake or diet composition presents a fresh perspective in dietary intervention research. The feasibility of the diet plan, developed based on individual profiles of the timing of dietary occasions identified before the intervention, marks a significant step towards personalised nutrition.

      We thank the reviewer for the generally positive assessment of our study and for sharing the view that our personalized approach represents an innovative step in chrononutrition.

      Weaknesses:

      (1) Several methodological issues detract from the study's credibility, including unclear definitions not widely recognized in nutrition or dietetics (e.g., "caloric event"), lack of comprehensive data on body composition, and potential confounders not accounted for (e.g., age range, menstrual cycle, shift work, unmatched cohorts, inclusion of individuals with normal weight, overweight, and obesity).

      We have replaced the term "caloric event" with "calorie intake occasion" and otherwise revised our manuscript with regard to other terminology in order to avoid ambiguity.

      We agree with the reviewer that the determination of body composition is a very important parameter to be investigated. Such investigations will definitely be part of the future continuation of the study. In this pilot study, we aimed to clarify in principle whether our intervention approach shows effects. Since we believe that this is certainly the case, we would like to address the question of what exactly the physiological mechanisms are that explain the observed weight loss in the future.

      Part of these future studies will also include other parameters in the analyses. However, in response to the reviewer's suggestions, we have already completed analyses regarding age and gender of the participants, which show that both variables have no influence on weight loss.

      In our view, the menstrual cycle should not have a major influence on the effectiveness of a 6-week intervention.

      The inclusion of shift workers is not a problem from our point of view. If their work shifts allow them to follow their personal eating schedule, we see no violation of our hypothesis. If this is not the case, as our data in Fig. 1G show, we do not expect any weight loss. Nevertheless, the reviewer is of course right that shift work can generally be a confounding factor and have an influence on weight loss success. To our knowledge, none of the 100 participants evaluated were shift workers. In a continuation of the study, however, shift work should be an exclusion criterion. Yet, our intervention approach could be of great interest for shift workers in particular, as they may be at a particularly high risk of obesity due to irregular eating times. A separate study with shift workers alone could therefore be of particular interest.

      The fact that it turned out that the baseline BMI of the remaining 67 EG and 33 CG participants did not match is discussed in detail in the section "3.1 Limitations". Although this is a limitation, it does not raise much doubt about the effectiveness of the intervention, as a subgroup analysis shows that intervention subjects lose more weight than control subjects of the same BMI.

      The inclusion of a wide BMI range was intentional. Our hypothesis is that reduced temporal variability in eating times optimizes metabolism and therefore excess body weight is lost (which we would like to investigate specifically in future studies). We hypothesize that people living with a high BMI will experience greater optimization than people with a lower BMI. Our data in Figs. 1H and S2I suggest that this assumption is correct.

      (2) The primary outcome's reliance on self-reported body weight and subsequent measurement biases further undermines the reliability of the findings.

      Self-reported data is always more prone to errors than objectively measured data. With regard to the collection of body weight, we were severely restricted in terms of direct contact with the participants during the conduct of the study due to the Covid-19 pandemic. At least the measurement of the initial body weight (at T0), the body weight after the end of the exploration phase (at T1) and the final body weight (at T2) were measured in video calls in the (virtual) presence of the study staff. These are the measurement points that were decisive for our analyses. Intermediate self-reported measurement points were not considered for analyses. We have added in the Materials & Methods section that video calls were undertaken to minimize the risk of misreporting.

      (3) Additionally, the absence of registration in clinical trial registries, such as the EU Clinical Trials Register or clinicaltrials.gov, and the multiple testing of hypotheses which were not listed a priori in the research protocol published on the German Register of Clinical Trials impede the study's transparency and reproducibility.

      Our study was registered in the DRKS - German Clinical Trials Register in accordance with international requirements. The DRKS fulfills the same important criteria as the EU Clinical Trial Register and clinicaltrials.gov.

      We quote from the homepage of the DRKS: „The DRKS is the approved WHO Primary Register in Germany and thus meets the requirements of the International Committee of Medical Journal Editors (ICMJE). […] The WHO brings together the worldwide activities for the registration of clinical trials on the International Clinical Trials Registry Platform (ICTRP). […] As a Primary Register, the DRKS is a member of the ICTRP network.”

      We are therefore convinced that we registered our study in the correct place.

      Furthermore, in our view, we did not provide less information on planned analyses than is usual and all our analyses were covered by the information in the study registry. We have stated the hypothesis in the study register that „strict adherence to [personalized] mealtimes will lead to a strengthening of the circadian system in the digestive tract and thus to an optimization of the utilization of nutrients and ultimately to the adjustment of body weight to an individual ideal value.“

      In our view, numerous analyses are necessary to test this hypothesis. We investigated whether it is the adherence to eating times that is related to the observed weight loss (Fig. 1), or possibly other variables resulting from adherence to the meal schedule (Fig. 3). In addition, we analyzed whether the intervention optimized the utilization of nutrients, which we did based on the food composition and number of calories during the exploration and intervention phases (Fig. 2). We investigated whether the personalization of meal schedules plays a role (Fig. 3). And we attempted to analyze whether the adjustment of body weight to an individual ideal value occurs by correlating the influence of the original BMI with weight loss. Only the hypothesis that the circadian system in the digestive tract is strengthened has not yet been directly investigated, a fact that is listed as a limitation. Although it can be assumed that this has happened, as the Zeitgeber “food” has lost significant variability as a result of the intervention. The analyses on general well-being are covered in the study protocol by the listing of secondary endpoints.

      Beyond that, we did not analyze any hypotheses that were not formulated a priori.

      For these reasons, we see no restriction in transparency, reproducibility or requirements and regulations.

      Achievement of Objectives and Support for Conclusions:

      (4) The study's objectives were partially met; however, the interpretation of the effects of meal timing on weight loss is compromised by the weaknesses mentioned above. The evidence only partially supports some of the claims due to methodological flaws and unstructured data analysis.

      We hope that we have been able to dispel uncertainties regarding some interpretations through supplementary analyses and the addition of some methodological details.

      Impact and Utility:

      (5) Despite its innovative approach, significant methodological and analytical shortcomings limit the study's utility. If these issues were addressed, the research could have meaningful implications for dietary interventions and metabolic research. The concept of timing of dietary occasions in sync with circadian rhythms holds promise but requires further rigorous investigation.

      We are pleased with the assessment that our data to date is promising. We hope that the revised version will already clarify some of the doubts about the data available so far. Furthermore, we absolutely agree with the reviewer: the present study serves to verify whether our intervention approach is potentially effective for weight loss - which we believe is the case. In the next steps, we plan to include extensive metabolic studies and to adjust the limitations of the present study.

      Reviewer #3 (Public Review):

      The authors tested a dietary intervention focused on improving meal regularity in this interesting paper. The study, a two-group, single-center, randomized, controlled, single-blind trial, utilized a smartphone application to track participants' meal frequencies and instructed the experimental group to confine their eating to these times for six weeks. The authors concluded that improving meal regularity reduced excess body weight despite food intake not being altered and contributed to overall improvements in well-being.

      The concept is interesting, but the need for more rigor is of concern.

      We would like to thank the reviewer for the interest in our study.

      (1) A notable limitation is the reliance on self-reported food intake, with the primary outcome being self-reported body weight/BMI, indicating an average weight loss of 2.62 kg. Despite no observed change in caloric intake, the authors assert weight loss among participants.

      As already described above in the responses to the reviewer 2, the body weight assessment took place in video calls in the (virtual) presence of study staff, so that the risk of misreporting is minimized. We have added this information to the manuscript.

      When recording food intake, we had to weigh up the risk of misreporting against the risk of a lack of validity in a permanently monitored setting. It was important to us to investigate the effectiveness of the intervention in the participants' everyday environment and not in a laboratory setting in order to be able to convincingly demonstrate its applicability in everyday life. The restriction of self-reporting is therefore unavoidable in our view and must be accepted. It can possibly be reduced by photographing the food, but even this is not a complete protection against underreporting, as there is no guarantee that everything that is ingested is actually photographed.

      However, our analyses show that the reporting behavior of individual participants did not change significantly between the exploration and intervention phases. We do not assume that participants who underreported only did so during the exploration phase (and only ate more than reported in this study phase) and reported correctly in the intervention phase (and then indeed consumed fewer calories).  We discuss this point in the section "3.1 Limitations".

      (2) The trial's reliance on self-reported caloric intake is problematic, as participants tend to underreport intake; for example, in the NEJM paper (DOI: 10.1056/NEJM199212313272701), some participants underreported caloric intake by approximately 50%, rendering such data unreliable and hence misleading. More rigorous methods for assessing food intake are available and should have been utilized. Merely acknowledging the unreliability of self-reported caloric intake is insufficient as it would still leave the reader with the impression that there is no change in food intake when we actually have no idea if food intake was altered. A more robust approach to assessing food intake is imperative. Even if a decrease in caloric intake is observed through rigorous measurement, as I am convinced a more rigorous study would unveil testing this paradigm, this intervention may merely represent another short-term diet among countless others that show that one may lose weight by going on a diet, principally due to heightened dietary awareness.

      The risks of self-reporting, our considerations, and our analysis of participants' reporting behavior and caloric intake over the course of the study are discussed in detail both in our responses above and in the manuscript. 

      With regard to the reviewer's second argument, we largely aligned the study protocol of the control group with that of the experimental group. Apart from the fact that the control subjects were given no guidelines on eating times, only a very rough 18-hour window for food intake, the content of the sessions and the measurement methods were the same in both groups. The possibility of increased nutritional awareness was therefore equally present in both groups, yet only the participants in the experimental group lost a significant amount of body weight.

      In future continuations of the study, follow-up after a longer period than four weeks (e.g. after 6 months) can be included in the protocol in order to examine whether the effects are sustained over a longer period.

      (3) Furthermore, the assessment of circadian rhythm using the MCTQ, a self-reported measure of chronotype, may not be as reliable as more objective methods like actigraphy.

      The MCTQ is a validated means of determining chronotype and its results are significantly associated with the results of actigraphic measurements. In our view, the MCTQ is sufficient to test our hypothesis that matching the chronobiological characteristics of participants is beneficial. Nevertheless, measurements using actigraphy could be of interest, for example to correlate the success of weight loss with parameters of the sleep-wake rhythm.

      (4) Given the potential limitations associated with self-reported data in both dietary intake and circadian rhythm assessment, the overall impact of this manuscript is low. Increasing rigor by incorporating more objective and reliable measurement techniques in future studies could strengthen the validity and impact of the findings.

      The body weight data were not self-reported; the measurements were taken in the presence of study staff. Although optimization might be possible (see above), we currently see no feasible alternative to self-report for recording all calorie intake occasions in the participants' natural environment over a period of several weeks (or possibly longer, as noted by the reviewer). For the future continuation of the study, we are planning occasional indirect calorimetry measurements that can provide information about the actual amount of food consumed in different phases of the study. These can reveal errors in the self-report but cannot replace daily data collection by means of self-report.

      Reviewer #1 (Recommendations For The Authors):

      Summary:

      This interesting and timely study by Wilming and colleagues examines the effect of regularity vs. irregularity of feeding on body weight dynamics and BMI. A rigorous assessment of this effect in humans has been lacking, and this study provides one. The study is well-designed, with a 14-day exploration phase followed by 6 weeks of intervention, and it is commendable to see the number of participants (100) who completed the study. Incorporation of a follow-up assessment 4 weeks after the conclusion of the study shows maintained weight loss in a subset of Experimental Group (EG) participants who continue with regular meals. There are several key observations, including particular meal times (lunch and dinner), which, when restricted to 45 min or less in duration (MTVS of 3 or less), will lead to efficient weight loss, as well as correlations between baseline BMI and weight loss. The authors also exclude the impact of self-reported meal composition on the efficiency of weight loss in the EG group in the context of this study. The study reports interesting effects of regularity of feeding on eating behavior, which appears to be independent of weight loss. Finally, the authors highlight an important point: to provide attention to personalized feeding and circadian windows and that personalized interventions that cater to individual circadian structures will result in more significant weight loss. This is an important concept that needs to be brought to light. There are only a few minor comments listed below:

      Minor comments:

      (1) The authors may provide explanations for the reduction in the MTVS in the EG and the increase in the same for the Control Group (CG). The increases in MTVS in CG are surprising (lines 105-106) because it is assumed that there is no difference in CG eating patterns prior to and during the study.

      As the reviewer correctly states, our assumption was that the MTVS should not change before and during the study - but we could not rule this out, as the subjects were given no instruction regarding the regularity of food intake within the fixed time window during their meetings with study staff, i.e. they were not told to continue eating exactly as before. Such an instruction might have led participants to strive to adhere to a schedule as precisely as possible. As a result, there was a statistically significant worsening of the MTVS in the CG, which amounted to less than 0.6 MTVS, i.e. a time span of only approx. ±7.5 min, and remained within an MTVS of 3. Since there were no correlations between the measured MTVS and the weight of the subjects in the CG, and a change of about half an MTVS unit has only a rather minor effect on weight, we do not attribute great significance to the observed deterioration of the MTVS.

      (2) There would be greater clarity for the readers if the authors clearly defined the study design in detail at the outset of the study, e.g., in section 2.1.

      We have included a brief summary of the study design at the end of the introduction so that the reader is already familiar with it at the beginning of the manuscript without having to switch to the material and methods section.

      (3) The data in Fig S2H is important and informs readers that the regularity of lunch and dinner is more related to body weight changes than that of breakfast. These data should be incorporated in the Main Figure. In addition, analyses of the Table S7 data (indicating that an MTVS of no greater than 3, i.e. a ±45 min meal-timing window, is associated with efficient weight loss) should be represented in a figure panel in the Main Figures.

      As suggested by the reviewer, we have moved Fig. S2H to the main Fig. 1. In addition, Table S7 is now no longer inserted as a supplementary table but as main Table 1 in the manuscript.

      (4) The authors state in lines 222-223 that "weight changes of participants were not related to one of these changes in eating characteristics (Fig. 3B-D, Tab. S6)", referring to the shortening of feeding windows as noted in the EG group. This is a rather simplistic statement, which should be amended to include that weight changes may not relate to changes in eating characteristics per se but likely relate to changes in metabolic programming, for instance, energy expenditure increases, which have been shown to associate with these changes in eating characteristics. This is important to note.

      We have changed the wording at this point so that it is clear that we are only referring here in the results section to the results of the mathematical analysis, which showed no correlation between the eating time window and weight loss in our sample. However, we have now explicitly mentioned the change in metabolic programming correctly noted by the reviewer in the discussion at the end of section 3.

      (5) Please provide more background and details on the attributes that define individual participant chronotypes in the manuscript before discussing datasets, e.g., mSP and mEP. This is in relation to the narrative in lines 228-230: "Indeed, our data show that the later the chronotype of participants (measured by the MCTQ mid-sleep phase, mSP [24]), the later their mid-eat phase (mEP) on weekends (Fig. 3E, Tab. S6), with the mSP and mEP being almost antiphasic on average (Fig. 3F, Tab. S10)." This will help readers unfamiliar with circadian biology/chronobiology research understand the contents of this manuscript, particularly Fig 3.

      We have explained the new chronobiology terms that appear in the chapter better in the revised version so that they are easier to understand.
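For orientation, the MCTQ-derived mid-sleep phase (mSP) is the clock midpoint of the sleep episode (typically on work-free days), and the mid-eat phase (mEP) is defined analogously as the midpoint of the daily eating window. As a purely illustrative sketch (the function below is ours, not part of the study's analysis code), the midpoint of a possibly midnight-crossing interval can be computed as:

```python
from datetime import datetime, timedelta

def mid_phase(start: str, end: str) -> str:
    """Clock midpoint (HH:MM) of an interval such as a sleep or eating episode.

    Intervals that cross midnight (e.g. sleep onset 23:30, wake 07:30) are
    handled by shifting the end time to the next day when it precedes the start.
    """
    fmt = "%H:%M"
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 <= t0:
        t1 += timedelta(days=1)
    return (t0 + (t1 - t0) / 2).strftime(fmt)

print(mid_phase("23:30", "07:30"))  # sleep onset/wake -> mid-sleep 03:30
```

Sleep from 23:30 to 07:30 thus yields an mSP of 03:30; an eating window from 08:00 to 20:00 would yield an mEP of 14:00, illustrating the roughly antiphasic relationship described in the manuscript.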

      Reviewer #2 (Recommendations For The Authors):

      (1) Clarify Terminology: Define or avoid using ambiguous terms such as "caloric event" to prevent confusion, especially for readers less familiar with chronobiology. Consider providing clear explanations or opting for more widely understood terms.

      We have replaced "caloric event" with "calorie intake occasion" and explain various chronobiology terms better, so that hopefully readers from other disciplines can now follow the text more easily.

      (2) Detailed Methodological Descriptions: Improve the transparency of your methods, especially concerning the measurement of primary and secondary outcomes. Address the concerns raised about the reliability of self-reported weight and the potential biases in measurement methods.

      In the section "3.1 Limitations", we have examined the aspect of the reliability of self-reported data and our measures to reduce this uncertainty in more detail. We have also added further details on the measurement of outcomes in the materials and methods section.

      (3) Address Participant Selection Criteria: Reevaluate the inclusion criteria and consider discussing the implications on the study's findings of the broad age range, the inclusion of shift work, unmatched cohorts, and inclusion of individuals with normal weight, overweight, and obesity. Provide a subgroup analysis or discuss how BMI might have influenced the results. Even though this is an additional post-hoc analysis, it would directly address one of the major weaknesses of the study design.

      We have supplemented the analyses and now show in Fig. S2G that neither age nor gender had any influence on weight loss as a result of the intervention. To our knowledge, none of the 100 participants evaluated were shift workers. Even if shift workers were part of the study without our knowledge, we do not consider this to be a problem as long as their shifts allow them to keep to certain eating times. The fact that it turned out that the baseline BMI of the remaining 67 EG and 33 CG participants did not match is discussed in detail in the section "3.1 Limitations". Our previous analysis in Fig. S2I already showed that there is a negative correlation between baseline BMI and weight loss - an interesting result, as it shows that people with a high BMI particularly benefit from the intervention. In addition, we already showed in Fig. S2J in a subgroup analysis that in all strata the BMI of EG subjects decreased more than that of CG subjects, even if they had the same initial BMI. We do not consider the wide dispersion of the BMIs of the included participants to be a weakness of the study design. On the contrary, it allows us to make a statement about which target group the intervention is particularly suitable for.

      (4) Improve Statistical Analysis: If not already done, involve a biostatistician to review the statistical analyses, particularly concerning post-hoc tests, correlation analyses, and the handling of measurement biases. Ensure that deviations from the original study protocol are clearly documented and justified.

      All analyses were planned jointly with a statistician, and were checked and approved by him.

      (5) Data Interpretation and Speculation: Limit speculation and clearly distinguish between findings supported by your data from hypotheses and future directions. Ensure that discussions about the implications of meal timing on metabolism are supported by evidence with adequate references and clearly state where further research is needed.

      We have revised the discussion and, especially through the detailed discussions of the limitations, we have emphasized more clearly what has been achieved and what still needs to be proven in future studies.

      (6) Clinical Trial Registration: Address the lack of registration in the EU Clinical Trials Register and clinicaltrials.gov. Discuss its potential implications on the study's transparency and how it aligns with current requirements and regulations.

      Our study was registered in the DRKS - German Clinical Trials Register in accordance with international requirements. The DRKS fulfills the same important criteria as the EU Clinical Trial Register and clinicaltrials.gov.

      We quote from the homepage of the DRKS: "The DRKS is the approved WHO Primary Register in Germany and thus meets the requirements of the International Committee of Medical Journal Editors (ICMJE). […] The WHO brings together the worldwide activities for the registration of clinical trials on the International Clinical Trials Registry Platform (ICTRP). […] As a Primary Register, the DRKS is a member of the ICTRP network."

      We are therefore convinced that we registered our study in the correct place before it began and see no restriction in transparency or requirements and regulations.

      (7) Use of Sensitive and Current Terminology: Update the manuscript to reflect the latest recommendations regarding the language used to describe obesity and patients living with obesity. This ensures respect and accuracy in reporting and aligns with contemporary standards in the field.

      We updated the manuscript accordingly.

      (8) Strengthen the Introduction: Expand the literature review to include more recent and relevant studies that contextualise your work within the broader field of chrononutrition. This could help clarify how your study builds upon or diverges from existing research.

      We have included further studies in the introduction that aim to reduce body weight by restricting food intake to certain time periods. We have also more clearly contrasted the designs of these studies with the design of our study.

      (9) Clarify Discrepancies and Errors: Address any inconsistencies, such as the discrepancy in meal timing instructions (90 minutes reported in the conclusion vs. 60 minutes reported in the methods), and ensure all figures, tables, and statistical analyses are correctly referenced and described.

      The first point mentioned by the reviewer is not an inconsistency. To ensure the feasibility of the intervention, each participant was initially given a time window of +/- 30 minutes (60 min) from the specified eating time. Our later analyses show that even a time window of +/- 45 minutes (90 min) around the specified eating time is sufficient to lose weight efficiently (see results in Table 1).

      We have checked all references to figures, tables and statistical analyses and updated them if necessary.

      (10) Discuss Limitations and Bias: More thoroughly discuss the limitations of your study, including the potential impacts of biases and how they were mitigated. Additionally, consider the effects of including shift workers and how this choice impacts the applicability of your findings.

      Section “3.1 Limitations” has now been supplemented by a number of points and discussions. As described above, we do not consider the inclusion of shift workers to be a limitation as long as they are able to adhere to the specifications of the eating time plan. We cannot derive any indications to the contrary from our data.

      (11) Consider Publishing Separate Manuscripts: If the study encompasses a wide range of outcomes or post-hoc analyses, consider separating these into distinct publications to allow for a more focused and detailed exploration of each set of findings.

      We will take this advice into consideration for future publications on the continuation of the study. As this is a pilot study intended to clarify whether and to what extent the intervention is effective, we believe it makes sense to report all the data in a single publication.

      (12) By addressing these recommendations, the authors can significantly improve their manuscript's clarity, reliability, and impact. This would not only support the dissemination of their findings but also would contribute valuable insights into the growing field of chrononutrition.

      We hope that we have satisfactorily answered, discussed and implemented the reviewer's points in the manuscript, so that its clarity, reliability, and impact have been increased and it can offer a valuable contribution to the field of chrononutrition.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The report describes the control of the activity of the RNA-activated protein kinase, PKR, by the Vaccinia virus K3 protein. Repressive binding of K3 to the kinase prevents phosphorylation of its recognised substrate, eIF2α (the α subunit of the eukaryotic initiation factor 2). The interaction with K3 is probed by saturation mutagenesis within four regions of PKR chosen by modelling the molecules' interaction. The authors identify K3-resistant PKR variants, revealing that the K3/eIF2α-binding surface of the kinase is malleable. This is reasonably interpreted as indicating the potential adaptability of this antiviral protein to combat viral virulence factors.

      Strengths:

      This is a well-conducted study that probes the versatility of the antiviral response to escape a viral inhibitor. The experimentation is very diligent, generating and screening a large number of variants to recognise the malleability of residues at the interface between PKR and K3.

      Weaknesses:

      (1) These are minor. The protein interaction between PKR and K3 has been previously well-explored through phylogenetic and functional analyses and molecular dynamics studies, as well as with more limited site-directed mutational studies using the same experimental assays.

      Accordingly, these findings largely reinforce what had been established rather than making major discoveries.

      First, thank you for your thoughtful feedback. We agree that our results are concordant with previous findings and recognize the importance of emphasizing what we find novel in our results. We have revised the introduction (lines 65-74 of the revised_manuscript.pdf) to emphasize three findings of interest: (1) the PKR kinase domain is largely pliable across its substrate-binding interface, a remarkable quality that is most fully revealed through a comprehensive screen, (2) we were able to differentiate variants that render PKR nonfunctional from those that are susceptible to Vaccinia K3, and (3) we observe a strong correlation between PKR variants that are resistant to K3 WT and K3-H47R.

      There are some presumptions:

      (2) It isn't established that the different PKR constructs are expressed equivalently so there is the contingency that this could account for some of the functional differences.

      This is an excellent point. We have revised the manuscript to raise this caveat in the discussion (lines 247-251). One indirect reason to suppose that expression differences among our PKR variants are not a dominant source of variation is that we did not observe much variation in kinase activity in the absence of K3.

      (3) Details about the confirmation of PKR used to model the interaction aren't given so it isn't clear how accurately the model captures the active kinase state. This is important for the interaction with K3/EIF2α.

      We have expanded on Supplemental Figure 12 and our description of the AlphaFold2 models in the Materials and Methods section (lines 573-590). We clarify that these models may not accurately capture the phosphoacceptor loop of eIF2α (residues Glu49-Lys60) and the PKR β4-5 linker (Asp338-Asn350) as these are highly flexible regions that are absent in the existing crystal structure complex (PDB 2A1A) and have low AlphaFold2 confidence scores (pLDDT < 50). We also noted, in the Materials and Methods section and in the caption of Figure 1, that the modeled eIF2α closely resembles the crystal structure of standalone yeast eIF2α, which places the Ser51 phosphoacceptor site far from the PKR active site. Thus, we expect there are additional undetermined PKR residues that contact eIF2α.

      (4) Not all regions identified to form the interface between PKR and K3 were assessed in the experimentation. It isn't clear why residues between positions 332-358 weren't examined, particularly as this would have made this report more complete than preceding studies of this protein interaction.

      Great questions. We designed and generated the PKR variant library based on the vaccinia K3 crystal structure (PDB 1LUZ) aligned to eIF2α in complex with PKR (PDB 2A1A), in which PKR residues 338-350 are absent. Since the project began, we have generated the AlphaFold2-predicted complex of PKR and vaccinia K3 and have become very interested in the β4-β5 linker, a highly diverse region across PKR homologs which includes residues 332-358. However, this region remains unexamined in this manuscript.

      Reviewer #2 (Public Review):

      Chambers et al. (2024) present a systematic and unbiased approach to explore the evolutionary potential of the human antiviral protein kinase R (PKR) to evade inhibition by a poxviral antagonist while maintaining one of its essential functions.

      The authors generated a library of 426 single-nucleotide polymorphism (SNP)-accessible non-synonymous variants of PKR kinase domain and used a yeast-based heterologous virus-host system to assess PKR variants' ability to escape antagonism by the vaccinia virus pseudo-substrate inhibitor K3. The study identified determinant sites in the PKR kinase domain that harbor K3-resistant variants, as well as sites where variation leads to PKR loss of function. The authors found that multiple K3-resistant variants are readily available throughout the domain interface and are enriched at sites under positive selection. They further found some evidence of PKR resilience to viral antagonist diversification. These findings highlight the remarkable adaptability of PKR in response to viral antagonism by mimicry.

      Significance of the findings:

      The findings are important with implications for various fields, including evolutionary biology, virus-host interfaces, genetic conflicts, and antiviral immunity.

      Strength of the evidence:

      Convincing methodology using state-of-the-art mutational scanning approach in an elegant and simple setup to address important challenges in virus-host molecular conflicts and protein adaptations.

      Strengths:

      Systematic and Unbiased Approach:

      The study's comprehensive approach to generating and characterizing a large library of PKR variants provides valuable insights into the evolutionary landscape of the PKR kinase domain. By focusing on SNP-accessible variants, the authors ensure the relevance of their findings to naturally occurring mutations.
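To make the "SNP-accessible" constraint concrete: from any given codon, only a subset of amino-acid substitutions is reachable by a single nucleotide change. The sketch below (our illustration, not the authors' library-design code) enumerates that subset using the standard genetic code, excluding stop codons and synonymous changes:

```python
from itertools import product

# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AA_STRING = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA_STRING)}

def snp_accessible_substitutions(codon: str) -> set:
    """Amino-acid substitutions reachable from `codon` by one nucleotide change.

    Stop codons and synonymous changes are excluded, mirroring the idea of a
    library restricted to SNP-accessible non-synonymous variants.
    """
    wt_aa = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant_aa = CODON_TABLE[codon[:pos] + base + codon[pos + 1:]]
            if mutant_aa not in ("*", wt_aa):
                reachable.add(mutant_aa)
    return reachable

# A glutamate codon (GAA) can reach six amino acids via single SNPs.
print(sorted(snp_accessible_substitutions("GAA")))  # → ['A', 'D', 'G', 'K', 'Q', 'V']
```

All other substitutions at such a position would require two or three nucleotide changes, which is why restricting a library to SNP-accessible variants keeps it focused on mutations that evolution can sample in a single step.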

      Identification of Key Sites:

      The identification of specific sites in the PKR kinase domain that confer resistance or susceptibility to a poxvirus pseudosubstrate inhibition is a significant contribution.

      Evolutionary Implications:

      The authors performed meticulous comparative analyses throughout the study between the functional variants from their mutagenesis screen ("prospective") and the evolutionarily-relevant past adaptations ("retrospective").

      Experimental Design:

      The use of a yeast-based assay to simultaneously assess PKR capacity to induce cell growth arrest and susceptibility/resistance to various VACV K3 alleles is an efficient approach. The combination of this assay with high-throughput sequencing allows for the rapid characterization of a large number of PKR variants.

      Areas for Improvement:

      (5) Validation of the screen: The results would be strengthened by validating results from the screen on a handful of candidate PKR variants, either using a similar yeast heterologous assay, or - even more powerfully - in another experimental system assaying for similar function (cell translation arrest) or protein-protein interaction.

      Thank you for your thoughtful feedback. We agree that additional data to validate our findings would strengthen the manuscript. We have individually screened a handful of PKR variants in duplicate using serial dilution to measure yeast growth, and found that the results generally support our original findings. We have revised the manuscript to include these validation experiments (lines 117-119 of the revised_manuscript.pdf, Supplemental Figure 4).

      (6) Evolutionary Data: Beyond residues under positive selection, the screen would allow the authors to also perform a comparative analysis with PKR residues under purifying selection. Because they are assessing one of the most conserved ancestral functions of PKR (i.e. cell translation arrest), it may also be of interest to discuss these highly conserved sites.

      This is a great point. We do find that there are regions of the PKR kinase domain that are not amenable to genetic perturbation, namely in the glycine rich loop and active site. We contrast the PKR functional scores at conserved residues under purifying selection with those under positive selection in Figure 2E (lines 141-143).

      (7) Mechanistic Insights: While the study identifies key sites and residues involved in vaccinia K3 resistance, it could benefit from further investigation into the underlying molecular mechanisms. The study's reliance on a single experimental approach, deep mutational scanning, may introduce biases and limit the scope of the findings. The authors may acknowledge these limitations in the Discussion.

      We agree that further investigation into the underlying molecular mechanisms is warranted and we have revised the manuscript to acknowledge this point in the discussion (lines 284-288).

      (8) Viral Diversity: The study focuses on the viral inhibitor K3 from vaccinia. Expanding the analysis to include other viral inhibitors, or exploring the effects of PKR variants on a range of viruses would strengthen and expand the study's conclusions. Would the identified VACV K3-resistant variants also be effective against other viral inhibitors (from pox or other viruses)? or in the context of infection with different viruses? Without such evidence, the authors may check the manuscript is specific about the conclusions.

      This is a fantastic question that we are interested in exploring in our future studies. In the manuscript we note a strong correlation between PKR variants that evade vaccinia wild-type K3 and the K3-H47R enhanced allele, but we are curious to know if this holds when tested against other K3 orthologs such as variola virus C3. That said, we have revised the manuscript to clarify this limitation to our findings and specify vaccinia K3 where appropriate.

      Reviewer #3 (Public Review):

      Summary:

      -  This study investigated how genetic variation in the human protein PKR can enable sensitivity or resistance to a viral inhibitor from the vaccinia virus called K3.

      -  The authors generated a collection of PKR mutants and characterized their activity in a high-throughput yeast assay to identify 1) which mutations alter PKR's intrinsic biochemical activity, 2) which mutations allow for PKR to escape from viral K3, and 3) which mutations allow for escape from a mutant version of K3 that was previously known to inhibit PKR more efficiently.

      -  As a result of this work, the authors generated a detailed map of residues at the PKR-K3 binding surface and the functional impacts of single mutation changes at these sites.

      Strengths:

      -  Experiments assessed each PKR variant against three different alleles of the K3 antagonist, allowing for a combinatorial view of how each PKR mutant performs in different settings.

      -  Nice development of a useful, high-throughput yeast assay to assess PKR activity, with highly detailed methods to facilitate open science and reproducibility.

      -  The authors generated a very clean, high-quality, and well-replicated dataset.

      Weaknesses:

      (9) The authors chose to focus solely on testing residues in or near the PKR-K3 predicted binding interface. As a result, there was only a moderately complex library of PKR mutants tested. The residues selected for investigation were logical, but this limited the potential for observing allosteric interactions or other less-expected results.

      First, we greatly appreciate all your feedback on the manuscript, as well as raising this particular point. We agree that this is a moderately complex library of PKR variants, from which we begin to uncover a highly pliable domain with a few specific sites that cannot be altered. We have revised the manuscript to raise this limitation (lines 284-288 of the revised_manuscript.pdf) and encourage additional exploration of the PKR kinase domain.

      (10) For residues of interest, some kind of independent validation assay would have been useful to demonstrate that this yeast fitness-based assay is a reliable and quantitative readout of PKR activity.

      We agree that additional data to validate our findings would strengthen the manuscript. We have individually screened a handful of PKR variants in duplicate using serial dilution to measure yeast growth, and generally found that the results support our original findings. We have revised the manuscript to include this validation experiment (lines 117-119, Supplemental Figure 4).

      (11) As written, the current version of the manuscript could use more context to help a general reader understand 1) what was previously known about these PKR and K3 variants, 2) what was known about how other genes involved in arms races evolve, or 3) what predictions or goals the authors had at the beginning of their experiment. As a result, this paper mostly provides a detailed catalog of variants and their effects. This will be a useful reference for those carrying out detailed, biochemical studies of PKR or K3, but any broader lessons are limited.

      Thank you for bringing this to our attention. We have revised the introduction of the manuscript to provide more context regarding previous work demonstrating an evolutionary arms race between PKR and K3 and how single residue changes alter K3 resistance (lines 51-64).

      (12) I felt there was a missed opportunity to connect the study's findings to outside evolutionary genetic information, beyond asking if there was overlap with PKR sites that a single previous study had identified as positively selected. For example, are there any signals of balancing selection for PKR? How much allelic diversity is there within humans, and are people typically heterozygous for PKR variants? Relatedly, although PKR variants were tested in isolation here, would the authors expect their functional impacts to be recessive or dominant, and would this alter their interpretations? On the viral diversity side, how much variation is there among K3 sequences? Is there an elevated evolutionary rate, for example, in K3 at residues that contact PKR sites that can confer resistance? None of these additions are essential, but some kind of discussion or analysis like this would help to connect the yeast-based PKR phenotypic assay presented here back to the real-world context for these genes.

      We appreciate this suggestion to extend our findings to a broader evolutionary context. There is little allelic diversity of PKR in humans, with all nonsynonymous variation listed in gnomAD being rare. (PKR shows sequence diversity in comparisons across species, including across primates.) Thus, barring the possibility of variation being present in under-studied populations, there is unlikely to be balancing selection on PKR in humans. Our expectation is that beneficial mutations in PKR for evading a pseudosubstrate inhibitor would be dominant, as a small amount of eIF2α phosphorylation is capable of halting translation (Siekierka, PNAS, 1984). There is a recent report citing PKR missense variants associated with dystonia that can be dominantly or recessively inherited (Eemy et al. 2020 PMID 33236446). Elde et al. 2009 (PMID 19043403) notes that poxvirus K3 homologs are under positive selection but no specific residues have been cited to be under positive selection. The lack of allelic diversity in PKR in humans notwithstanding, PKR could experience future selection in the human population as evidenced by its rapid evolution in primates, so we fully agree that a connection to the real-world context is useful. We have noted these topics in the discussion section (lines 289-294).

      Reviewer #1 (Recommendations For The Authors):

      I have no major criticisms but ask for some clarifications and make some comments about the perceived weaknesses.

      (13)  If the authors disagree with my summation that the findings largely replicate what was known, could they detail how the findings differ from what was known about this protein interaction and the major new insights stemming from the study? Currently, the abstract is a little philosophical rather than listing the explicit discoveries of the study.

      Thank you again for raising the need for us to clearly convey the novelty of our findings. We have revised the final paragraph in our introduction as described in comment #1.

      (14) As the experimental approach is well reported, it is unnecessary to confirm the proposed activity by, for instance, measures of Sui2 phosphorylation. However, previous reports have recognised that point mutants of PKR can be differentially expressed. The impact of this potential effect is unknown in the current experimentation as there are no measures of the expression of the different mutant PKR constructs. The large number of constructs used makes this verification onerous. The potential impact could be ameliorated by redundantly replacing each residue (hoping different residues have different effects on expression). Still, this limitation of the study should be acknowledged in the text.

      We greatly appreciate this comment and agree that this should be made clear in the text; we have added a statement to this effect in the discussion of the manuscript (lines 247-251).

      (15) Preceding findings and the modeling in this report recognise an involvement in the kinase insert region (residues 332 to 358) in PKR's interaction with K3 but this region is excluded from the analysis. These residues have been largely disregarded in the preceding analysis (it is absent from the molecular structure of the kinase) so its inclusion here might have lent a more novel aspect or delivered a more complete investigation. Is there a justification for excluding this flexible loop?

      The PKR variant library was designed based on the crystal structure of K3 (PDB 1LUZ) aligned to eIF2α in complex with PKR (PDB 2A1A). After the library was designed and made, we obtained complete predicted structures of PKR in complex with eIF2α and K3, which largely agree with the crystal structures but contain the additional flexible loops that were not captured in the crystal structures. Though the library studied here does not explore variation in the kinase insert region, we are very interested in doing so in our future studies.

      (16)  Could the explanation of the 'PKR functional score' be clarified? The description given within the legend of SF1 was helpful, so could this be replicated earlier in the main body of the text when introducing these experiments? e.g. As PKR activity is toxic to yeast, the number of cells in the pool expressing the functional PKR will decrease over time. Thus the associated barcode read count will also decrease, while the read count for the nonfunctional PKR will increase. This is termed the PKR function score, which will be relatively lower for cells transformed with less active PKR than those with more active PKR.

      Thank you for suggesting this clarification; we have revised the manuscript to clarify our definition of the PKR functional score (lines 106-109).

      (17)  Another suggestion to clarify this term is to modify the figures. Currently, the intent of the first simulated graph in Fig 1E is clear but the inversion of the response (shown by the transposition of the colours) in the next graph (to the right) is less immediately obvious. Accordingly, the orientation of the 'PKR functional score' is uncertain. Could the authors add text to the rightmost graphic in Figure 1E by, for instance, indicating the PKR activity in the vertical column with text such as 'less active' (at the bottom), 'WT' (in the centre), and 'more activity' (at the top)? Also, the position of the inactive K296R mutant might be added to Figure 2A complementing the positioning of the active WT kinase in the first data graph of this kind.

      We appreciate your specific feedback to improve the figures of the manuscript; we have made adjustments to Figure 1E to clarify how we derive the PKR functional scores.

      (18) The authors don't use existing structures of PKR in their modelling. However, there is no information about the state of the PKR molecule used for modelling. Specific elements of the kinase domain affect its interaction with K3, so it would be informative to know the orientation of these elements in the model. Could the authors detail the state of pivotal kinase elements in their models? This could involve the alignment of the N- and C-lobes, the orientation of kinase spines (C- and R-spines), and the phosphorylation status of residues in the activation loop, or at least the position of this loop in relation to that adopted in the active dimeric kinase (e.g. PDB 2A1A, 3UIU, or 6D3L). Alternatively, crystallographic structures of active and inactive PKR could be overlaid with the theoretical structure used for modelling (as supplementary information).

      We have revised the manuscript to describe the alignment of the predicted PKR-K3 complex with active and inactive PKR, and we have extended Supplemental Figure 12 with an overlay of the predicted structures with existing structures. We have also added a supplemental data file containing the RMSD values of PKR (from the predicted PKR-K3 complex) aligned to active (PDB 2A1A) and inactive (PDB 3UIU) or unphosphorylated (PDB 6D3L) PKR (5_Structure-Alignment-RMSD-Values.xlsx), and we have provided the AlphaFold2 best model predictions for the PKR-eIF2α complex (6_AF2_PKR-KD_eIF2a.pdb) and the PKR-K3 complex (7_AF2_PKR-KD_VACV-K3.pdb). Looking across the RMSD values, the AlphaFold2 model of PKR most closely resembles unphosphorylated PKR (PDB 6D3L), though we note the activation loop is absent from PDB 6D3L and 3UIU. We also aligned the Ser51 phosphoacceptor loop of the AlphaFold2 eIF2α model to PDB 1Q46 and see that the model reflects the pre-phosphorylation state. This loop is expected to interact with the PKR active site, which is not captured in our model, and we state this explicitly in the caption of Figure 1 (lines 665-668).

      (19) Could some specific residue in Figure 7 be labelled (numbered) to orient the findings? Also, the key in this figure doesn't title the residues coloured white (RE red/black/blue). The white also isn't distinguished from the green (outside the regions targeted for mutagenesis).

      Excellent suggestion; we have revised this figure to include labels for the sites to orient the reader and clarify our categorization of PKR residues in the kinase domain.

      (20) Regarding the discussion, the authors adopt the convention of describing K3 as a pseudosubstrate. Although I realize it is common to refer to K3 as a pseudosubstrate, it isn't phosphorylated and binds slightly differently to PKR, so alternative descriptors, such as 'a competitive binder', would more accurately present the protein's function. Possibly for this reason, the authors declared an expectation that evolutionary pressures should shift K3 to precisely mimic eIF2α. However, closer molecular mimicry shouldn't be expected for two reasons. The first is a risk of disrupting other interactions, such as the eIF2 complex. Secondly, equivalent binding to PKR would demote K3 to merely a stoichiometric competitor of eIF2α. In this instance, effective inhibition would require very high levels of K3 to compete with equivalent binding by eIF2α. This would be demanding, particularly upon induction of PKR during the interferon response. To be an effective inhibitor, K3 has to bind more avidly than eIF2α and merely requires a sufficient overlap with the eIF2α interface on PKR to disrupt this alternative association. This interpretation predicts that K3 is under pressure to bind PKR by a different mechanism than eIF2α.

      We appreciate your thoughtful point about the usage of the term pseudosubstrate. Ultimately, we’ve decided to continue using the term due to its historical usage in the field. The question of the optimal extent of mimicry in K3 is a fascinating one, and we greatly appreciate your thoughts. We wholly agree that the possibility of K3 having superior PKR binding relative to eIF2α would be preferable to perfect mimicry. In our Ideas and Speculation section, we propose that benefits towards increasing PKR affinity may need to be balanced against potential loss of host range resulting from overfitting to a given host’s PKR. However, the possibility that reduced mimicry could be selected to avoid disruption of eIF2 function had not occurred to us; thank you for pointing it out!

      (21) The discussion of the 'positive selection' of sites is also interesting in this context. To what extent has the proposed positive selection been quantified? My understanding is that all of the eIF2α kinases are conserved and so demonstrate lower levels of residue change than might be expected by random mutagenesis, i.e., variance is under negative selection. The relatively higher rate of variance in PKR orthologs compared to other eIF2α kinases could reflect some relaxation of these constraints, rather than positive selection. Greater tolerance of change may stem from PKR's more sporadic function in the immune response (infrequent and intermittent presence of its activating stimuli) rather than the ceaseless control of homeostasis by the other eIF2α kinases. Also, induction of PKR during the immune response might compensate for mutations that reduce its activity. I believe that the entire clade of extant poxviruses is young relative to the divergence between their hosts. Accordingly, genetic variance in PKR predates these viruses. Although a change in PKR may become fixed if it affords an advantage during infection, such an advantage to the host would be countered by the much higher mutation rates of the virus. This would appear to diminish the opportunity for a specific mutation to dominate a host population and, thereby, to differentiate host species. Rather, pressure to elude control by a rapidly evolving viral factor would favour variation at sites where K3 binds. This speculation offers an alternative perspective to the current discussion that the variance in PKR orthologs stems from positive selection driven by viral infection.

      We appreciate this stimulating feedback for discussion. Three of the four eIF2α kinases (HRI, PERK, and GCN2) appear to be under purifying selection (Elde et al. 2009, PMID 19043403), which stands in contrast to PKR. Residues under positive selection have been found throughout PKR, including the dsRNA binding domains, linker region, and the kinase domain. Importantly, the selection analyses from Elde et al. and Rothenburg et al. concluded that positive selection at these sites is more likely than relaxed selection. We agree that poxviruses are young, though we would guess that viral pseudosubstrate inhibition of PKR is ancient. Many viral proteins have been reported to directly interact with PKR, including herpes virus US11, influenza A virus NS1A, hepatitis C virus NS5A, and human immunodeficiency virus Tat. The PKR kinase domain does contain residues under purifying selection that are conserved among all four eIF2α kinases, but it also contains residues under positive selection that interface with the natural substrate eIF2α. Our work suggests that PKR is genetically pliable across several sites in the kinase domain, and we are curious to know if this pliability would hold at the same sites across the other three eIF2α kinases.

      (22) The manuscript is very well written but has a small number of typos; e.g. an aberrant 'e' ln 7 of the introduction, capitalise the R in ranavirus on the last line of the fourth paragraph of the discussion, and eIF2α (EIF2α?) is occasionally written as eIFα in the materials&methods.

      Thank you for bringing these typos to our attention! We’ve deleted the aberrant ‘e’ in the introduction, capitalized ‘Ranavirus’ in the discussion (line 265), and corrected ‘eIFα’ to ‘eIF2α’ throughout the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Additional minor edits or revisions:

      (23) Paragraph 3 of the Introduction gives the impression that most of the previous work on the PKR-virus arms race is speculative. However, it is one of the best-described and most convincing examples of virus-host arms races. Can the authors edit the paragraph accordingly?

      Thank you for bringing this to our attention. We have revised the third paragraph and strengthened the description of the evolutionary arms race between PKR and viral pseudosubstrate antagonists.

      (24) Introduction: PKR has "two" double-stranded RNA binding domains. Can the authors update the text accordingly?

      We have updated the manuscript to clarify PKR has two dsRNA binding domains (lines 44-45).

      (25) The authors test here for one of the key functions of PKR: cell growth/translation arrest. Because of PKR pleiotropy, the manuscript may be edited accordingly: For example, statements such as "We found few genetic variants render the PKR kinase domain nonfunctional" are too speculative as they may retain other (not tested here) functions.

      This is a great suggestion; we have revised the manuscript to specify our definition of nonfunctionality in the context of our experimental screen (lines 86-92 and 106-109) and to acknowledge this limitation of our experimental screen (lines 304-307).

      (26) The authors should specify "vaccinia" K3 whenever appropriate.

      We appreciate this comment and have revised the manuscript to specify vaccinia K3 where appropriate (e.g. lines 62, 66, 70, 80, 108, and 226).

      (27) Ref for ACE2 diversification may include Frank et al 2022 PMID: 35892217.

      Thank you for pointing us to this paper; we have included it as a reference in the manuscript (line 277).

      (28) Positive selection of PKR as referred to by the authors corresponds to analyses performed in primates. As shown by several studies, the sites under positive selection may vary according to host orders. Can the authors specify this ("primate") in their manuscript? And/or shortly discuss this aspect.

      Thank you for raising this point. In the manuscript we performed our analysis using vertebrate sites under positive selection as identified in Rothenburg et al. 2009, PMID 19043413 (line 51 and figure legends). We performed the same analysis using sites under positive selection in primates (as identified by Elde et al. 2009, PMID 19043403) and again found a significant difference in PKR functional scores versus K3. We have revised the manuscript to clarify our use of vertebrate sites under positive selection (lines 80-81).

      (29) "We view deep mutational scanning experiments as a complementary approach to positive selection": The authors should edit this and acknowledge previous and similar work on other antiviral factors, in particular one of the first studies of this kind on MxA (Colon-Thillet et al 2019 PMID: 31574080), and TRIM5 (Tenthorey et al 2020 PMID: 32930662).

      Thank you for raising these two papers, which we acknowledge in the revised manuscript (line 299).

      (30) We believe Figure S7 brings important results and should be placed in the Main.

      We appreciate this suggestion, and have moved the contents of the former supplementary Figure 7 to the main text, in Figure 6.

      (31) The title may specify "poxvirus".

      Thank you for the suggestion to specify the nature of our experiment, we have adjusted the title to: Systematic genetic characterization of the human PKR kinase domain highlights its functional malleability to escape a poxvirus substrate mimic (line 3).

      Reviewer #3 (Recommendations For The Authors):

      (32) No line numbers or page numbers are provided, which makes it difficult to comment.

      We sincerely apologize for this oversight and have included line numbers in our revised manuscript as well as the tracked changes document.

      (33) In the introduction, I recommend defining evolutionary arms races more clearly for a broad audience.

      Thank you for this suggestion. We have revised the manuscript in the first and third paragraphs to more clearly introduce readers to the concept of an evolutionary arms race.

      (34) The introduction could use a clearer statement of the question being considered and the gap in knowledge this paper is trying to address. Currently, the third paragraph includes many facts about PKR and the fourth paragraph jumps straight into the approach and results. Some elaboration here would convey the significance of the study more clearly. As is, the introduction reads a bit like "We wanted to do deep mutational scanning. PKR seemed like an ok protein to look at", rather than conveying a scientific question.

      This is a great suggestion to improve the introduction section. We have heavily revised the third and fourth paragraphs of the introduction to clarify the motivation, approach, and significance of our work.

      (35) Relatedly, did the authors have any hypotheses at the start of the experiment about what kinds of results they expected? e.g. What parts of PKR would be most likely to generate escape mutants? Would resistant mutants be rare or common? etc? This would help the reader to understand which results are expected vs. surprising.

      These are all great questions. We have revised the introduction of the manuscript to point out that previous studies have characterized a handful of PKR variants that evade vaccinia K3, and these variants were made at sites found to be under positive selection (lines 60-64).

      (36) A description of the different K3 variants and information about why they were chosen for study should also be added to the Introduction. It was not until Figure 5 that the reader was told that K3-H47R was the same as the 'enhanced' K3 allele you are testing.

      Thank you for bringing this to our attention, we have revised the introduction to clarify the experimental conditions (lines 65-67) and specify K3-H47R as the enhanced allele earlier in the manuscript (line 100).

      (37) Does every PKR include just a single point mutation? It would be nice to see data about the number and types of mutations in each PRK window added to Supplemental Figure 1.

      Thank you for the suggestion to improve this figure. Every PKR variant that we track has a single point mutation that generates a nonsynonymous mutation. In our PacBio sequencing of the PKR variant library we identified a few off-target variants or sequences with multiple variants, but we identified the barcodes linked to those constructs and discarded those variants in our analysis. We have revised Supplemental Figure 1 to include the number and types of mutations made at each PKR window.

      (38) In terms of the paper's logical flow, personally, I would expect to begin by testing which variants break PKR's function (Figure 3) and then proceeding to see which variants allow for K3 escape (Figure 2). Consider swapping the order of these sections.

      Thank you for this suggestion, and we can appreciate how the flow of the manuscript may be improved by swapping Figures 2 and 3. We have decided to maintain the current order of the figures because we use Figure 3 to emphasize the distinction of PKR sites that are nonfunctional versus susceptible to vaccinia K3.

      (39) Figure 3A seems like a less-informative version of Figure 4A, recommend combining these two. Same comment with Figure 5A and Figure 6A.

      We appreciate this specific feedback for the figures. Though there are similarities between figure panels (e.g. 3A and 4A) we use them to emphasize different points in each figure. For example, in Figure 3 we emphasize the general lack of variants that impair PKR kinase activity, and in Figure 4 we distinguish kinase-impaired variants from K3-susceptible variants. For this reason, and given space constraints, we have chosen to maintain the figures separately. We did decide to move the former Figure 6 to the supplement.

      (40) In general, it felt like there was a lot of repetition/re-graphing of the same data in Figures 3-6. I recommend condensing some of this, and/or moving some of the panels to supplemental figures.

      Thank you for your suggestion; we have revised the manuscript and moved Figure 6 to Supplemental Figure 7.

      (41) In contrast, Supplemental Figure 7 is helpful for understanding the distribution of the data. Recommend moving to the main text.

      This is a great recommendation, and we have moved Supplemental Figure 7 into Figure 6.

      (42) How do the authors interpret an enrichment of positively selected sites in K3-resistant variants, but not K3-H47R-resistant variants? This seems important. Please explain.

      Thank you for this suggestion to improve the manuscript; we agree that this observation warranted further exploration. We found a strong correlation in PKR functional scores between K3 WT and K3-H47R, and accordingly we find that sites under positive selection that are resistant to K3 WT are also resistant to K3-H47R. The lack of enrichment at positively selected sites appears to be caused by the collapsed dynamic range between PKR wild-type-like and nonfunctional variants in the K3-H47R screen. We have revised the manuscript to clarify this point (lines 202-204).

      (43) Discussion: The authors compare and contrast between PKR and ACE2, but it would be worth mentioning other examples of genes involved in antiviral arms races wherein flexible, unstructured loops are functionally important and are hotspots of positive selection (e.g. MxA, NLRP1, etc).

      We greatly appreciate this suggestion to improve the discussion. We note this contrast between the PKR kinase domain and the flexible linkers of MxA and NLRP1 in the revised manuscript (lines 273-274).

      (44) Speculation section: What is the host range of the vaccinia virus? Is it likely to be a generalist amongst many species' PKRs (and if so, how variable are those PKRs)? Would be worth mentioning for context if you want to discuss this topic.

      Thank you for raising this question. Vaccinia virus is the most well studied of the poxviruses, having been used as a vaccine to eradicate smallpox, and serves as a model poxvirus. Vaccinia virus has a broad host range, and though the name vaccinia derives from the Latin word “vacca” for cow, the virus's origin remains uncertain (Smith 2007 https://doi.org/10.1007/978-3-7643-7557-7_1). The natural host of vaccinia virus is unknown, though there is some evidence to suggest it may be native to rabbits, and the virus does appear to be a generalist, acting as a general inhibitor of vertebrate PKRs.

      (45) Many papers in this field discuss interactions between PKR and K3L, rather than K3. I understand that this is a gene vs. protein nomenclature issue, but consider matching the K3L literature to make this paper easier to find.

      Thank you for bringing this to our attention. We have revised the manuscript to specify that vaccinia K3 is expressed from the K3L gene in both the abstract (line 26) and the introduction (line 56) to help make this paper easier to find when searching for “K3L” literature.

      (46) Which PKR sequence was used as the wild-type background?

      This is a great question. We used the predominant allele circulating in the human population, represented by GenBank M85294.1:31-1686. We cite this sequence in the Methods (line 421) and have added it to the results section as well (line 84).

      (47) Figure 1C: the black dashed line is difficult to see. Recommend changing the colors in 1A-1C.

      Thank you for this suggestion; we have changed the dashed lines from black to white to make them more distinguishable.

      (48) Figure 1D: Part of the point of this figure is to convey overlaps between sites under selection, K3 contact sites, and eIF2alpha contact sites, but at this scale, many of the triangles overlap. It is therefore impossible to tell if the same sites are contacted vs. nearby sites. Perhaps the zoomed-in panels showing each of the four windows in the subsequent figures are sufficient?

      Thank you for bringing this to our attention. We have scaled the triangles down to reduce their overlap in Figure 1D and list all sites of interest (predicted eIF2α and vaccinia K3 contacts, conserved sites, and positively selected sites) in the Materials and Methods section “Predicted PKR complexes and substrate contacts”.

      (49) Figure 1E: under "1,293 Unique Combinations", there is a line between the PKR and K3 variants, which makes it look like they are expressed as a fusion protein. I believe these proteins were expressed from the same plasmid, but not as a fusion, so I recommend re-drawing. Then in the graph, the y-axis says "PKR abundance", but from the figure, it is not clear that this refers to relative abundance in a yeast pool. Perhaps "yeast growth" or similar would be clearer?

      Thank you for the specific feedback to improve Figure 1. We have made the suggested edits to clarify that PKR and vaccinia K3 are not fused but each is expressed from its own promoter. We have also changed the y-axis from “PKR Abundance” to “Yeast Growth”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Over the last decade, numerous studies have identified adaptation signals in modern humans driven by genomic variants introgressed from archaic hominins such as Neanderthals and Denisovans. One of the most classic signals comes from a beneficial haplotype in the EPAS1 gene in Tibetans that is evidently of Denisovan origin and facilitated high-altitude adaptation (HAA). Given that HAA is a complex trait with numerous underlying genetic contributions, in this paper Ferraretti et al. asked whether additional HAA-related genes may also exhibit a signature of adaptive introgression. Specifically, the authors considered that if such signatures exist, they are most likely only mild signals from polygenic selection, or soft sweeps on standing archaic variation, in contrast to a strong and nearly complete selection signal like that at EPAS1. Therefore, they leveraged two methods, a composite likelihood method for detecting adaptive introgression and a biological network-based method for detecting polygenic selection, and identified two additional genes that harbor plausible signatures of adaptive introgression for HAA.

      Strengths: 

      The study is well motivated by an important question, which is, whether archaic introgression can drive polygenic adaptation via multiple small effect contributions in genes underlying different biological pathways regulating a complex trait (such as HAA). This is a valid question and the influence of archaic introgression on polygenic adaptation has not been thoroughly explored by previous studies.

      The authors reexamined previously published high-altitude Tibetan whole-genome data and applied a couple of recently developed methods for detecting adaptive introgression and polygenic selection.

      Weaknesses: 

      My main concern with this paper is that I am not too convinced that the reported genomic regions putatively under polygenic selection are indeed of archaic origin. Other than some straightforward population structure characterizations, the authors mainly did two analyses with regard to the identification of adaptive introgression: First, they used a composite likelihood-based method, VolcanoFinder, to detect plausible archaic adaptive introgression and found two candidate genes (EP300 and NOS2). Next, they attempted to validate the identified signal using another method that detects polygenic selection based on biological network enrichments for archaic variants.

      In general, I don't see in the manuscript that the choice of methods here is well justified. VolcanoFinder is one among several commonly used methods for detecting adaptive introgression (e.g., the D, RD, U, and Q statistics, genomatnn, maldapt, etc.). Even if the selection was mild and incomplete, some of these other methods should be able to recapitulate and validate the results, which is currently missing in this paper. Besides, some of the recent papers that studied the distribution of archaic ancestry in Tibetans don't seem to report archaic segments in the two gene regions. Taken together, these points leave me unconvinced of the presence of archaic introgression, as opposed to just selection on ancestral variation.

      Furthermore, the authors tried to validate the results by using signet, a method that detects enrichments of alleles under selection in a set of biological networks related to the trait. However, the authors did not provide a sufficient description of how they defined archaic alleles when scoring the genes in the network. In fact, reading from the method description, they seem to have considered only alleles shared between Tibetans and Denisovans, but not necessarily exclusively shared between them. If the alleles used for scoring the networks in signet are also found in other populations such as Han Chinese or Africans, then that would make a substantial difference in the result, leading to potential false positives.

      Overall, I am not sure the evidence provided by this article is adequate to suggest archaic adaptive introgression. I recommend additional analyses for the authors to consider for rigorously testing their hypothesis. Please see the details in my review to the authors.

      Reviewer #2 (Public Review):

      In Ferraretti et al., the authors identify adaptively introgressed genes using VolcanoFinder and then identify pathways enriched for adaptively introgressed genes. They also use signet to identify pathways that are enriched for Denisovan alleles. The authors find that angiogenesis and nitric oxide induction are enriched for archaic introgression.

      Strengths: 

      Most papers that have studied the genetic basis of high altitude (HA) adaptation in Tibet have highly emphasized the role of a few genes (e.g. EPAS1, EGLN1), and in this paper, the authors look for more subtle signals in other genes (e.g. EP300, NOS2) to investigate how archaic introgression may be enriched at the pathway level.

      Looking into the biological functions enriched for Denisovan introgression in Tibetans is important for characterizing the impact of Denisovan introgression.

      Weaknesses: 

      The manuscript lacks details or justification about how/why some of the analyses were performed. Below are some examples where the authors could provide additional details.

      The authors made specific choices in their window analysis. These choices are not justified, and there is no comment as to how results might change if these choices were perturbed. For example, in the methods, the authors write "Then, the genome was divided into 200 kb windows with an overlap of 50 kb and for each of them we calculated the ratio between the number of significant SNVs and the total number of variants."

      Additional information is needed for clarity. For example, "we considered only protein-protein interactions showing confidence scores ≥ 0.7 and the obtained protein frameworks were integrated using information available in the literature regarding the functional role of the related genes and their possible involvement in high-altitude adaptation." What do the confidence scores mean? Why 0.7?

      In the method section (Identifying gene networks enriched for Denisovan-like derived alleles), the authors write "To validate VolcanoFinder results by using an independent approach". Does this mean that for signet the authors do not use the regions identified as adaptively introgressed using volcanofinder? I thought in the original signet paper, the authors used a summary describing the amount of introgression of a given region.

      Later, the authors write "To do so, we first compared the Tibetan and Denisovan genomes to assess which SNVs were present in both modern and archaic sequences. These loci were further compared with the ancestral reconstructed reference human genome sequence (1000 Genomes Project Consortium et al., 2015) to discard those presenting an ancestral state (i.e., that we have in common with several primate species)." It is not clear why the authors are citing the 1000 genomes project. Are they comparing with the reference human genome reference or with all populations in the 1000 genomes project? Also, are the authors allowing derived alleles that are shared with Africans? Typically, populations from Africa are used as controls since the Denisovan introgression occurred in Eurasia.

      The methods section for Figures 4B, 4C, and 4D is a little hard to understand. What is the x-axis on these plots? Is it the number of pairwise differences to Denisovan? The caption is not clear here. The authors mention that "Conversely, for non-introgressed loci (e.g., EGLN1), we might expect a remarkably different pattern of haplotypes distribution, with almost all haplotype classes presenting a larger proportion of non-Tibetan haplotypes rather than Tibetan ones." There is clearly structure in EGLN1. There is a group of non-Tibetan haplotypes that are closer to Denisovan and a group of Tibetan haplotypes that are distant from Denisovan...How do the authors interpret this? 

      In the original signet paper (Guoy and Excoffier 2017), they apply signet to data from Tibetans. Zhang et al. PNAS (2021) also applied it to Tibetans. It would be helpful to highlight how the approach here is different. 

      We thank the Reviewers for having appreciated the rationale of our study and for having identified potential issues that deserve to be addressed in order to better focus on robust results specifically supported by multiple approaches.

      First, we agree with the Reviewers that the clarification and justification of the methodologies adopted in the present study should be deepened with respect to the original version of the manuscript, with the purpose of making it more intelligible for a broad range of scientists. As reported thoroughly in the revised version of the text, the VolcanoFinder algorithm, which we used as the primary method to discover new candidate genomic regions affected by events of adaptive introgression, was chosen among the several approaches developed to detect signatures ascribable to such an evolutionary process for the following reasons. i) VolcanoFinder is one of the few methods that can jointly test for both archaic introgression and adaptive evolution (e.g., the D statistic cannot formally test for the action of natural selection, having been developed to provide genome-wide estimates of allele sharing between archaic and modern groups rather than to identify specific genomic regions enriched for introgressed alleles). ii) The model tested by the VolcanoFinder algorithm differs remarkably from those considered by other methods typically used to test for adaptive introgression, such as the RD, U and Q statistics; these are aimed at identifying chromosomal segments showing low divergence with respect to a specific archaic sequence and/or enriched in alleles uniquely shared between the admixed group and the source population, as well as characterized by a frequency above a certain threshold in the population under study, and are thus especially useful for testing a scenario in which adaptation was mediated by strong selective sweeps rather than weak polygenic mechanisms (see answer to comment #1 of Reviewer #1 for further details). iii) VolcanoFinder is computationally less demanding than other algorithms, such as genomatnn and MaLAdapt, which also require training on large genomic simulations built specifically to reflect the evolutionary history of the population under study, thus increasing the possibility of introducing bias into the obtained results if the information that guides the simulations is not accurate.

      Despite that, we agree with Reviewer #2 that some criteria formerly implemented during the filtering of VolcanoFinder results (e.g., normalization of LR scores, use of a sliding windows approach, and implementation of enrichment analysis based on specific confidence scores) might introduce erratic changes, which depend on the thresholds adopted, in the list of the genomic regions considered as the most likely candidates to have experienced adaptive introgression. To avoid this issue, and to adhere more strictly to the VolcanoFinder pipeline of analyses developed by Setter et al. 2020, in the revised version of the manuscript we have opted to use raw LR scores and to shortlist the most significant results by focusing on loci showing values falling in the top 5% of the genomic distribution obtained for such a statistic (see Materials and methods for details). 

      Moreover, to further reduce the use of potentially arbitrary filtering thresholds, we decided not to implement functional enrichment analysis to prioritize results from the VolcanoFinder method. To this end, although a STRING confidence score (i.e., the approximate probability that a predicted interaction exists between two proteins belonging to the same functional pathway according to information stored in the KEGG database) above 0.7 is generally considered a high confidence score (string-db.org, Szklarczyk et al. 2014), we replaced such a prioritization criterion by considering as the most robust candidates for adaptive introgression only those genomic regions supported by all the approaches used (i.e., the VolcanoFinder, Signet, LASSI and Haplostrips analyses).

      According to the Reviewers’ comments on the use of the Signet algorithm, we realized that the rationale behind such a validation approach was not well described in the original version of the manuscript. First and foremost, we would like to clarify that in the present study we did not use this method to test for the action of natural selection (as it was formerly used by Gouy et al. 2017), but specifically to identify genomic regions putatively affected by archaic introgression. For this purpose, we followed the approach described by Gouy and Excoffier 2020, searching for significant networks of genes presenting archaic-derived variants observable in the considered Tibetan populations but not in an outgroup population of African ancestry. Accordingly, we used the Signet method as an independent approach to obtain a first validation of the introgressed (but not necessarily adaptive) loci pointed out by the VolcanoFinder results.

      In detail, in response to the question by Reviewer #2 about which genomic regions were considered in the Signet analysis, it is necessary to clarify that, to obtain the input score associated with each gene along the genome, as required by the algorithm, we calculated average frequency values per gene by considering all the archaic-derived alleles included in the Tibetan dataset but not in the outgroup one. Therefore, we did not take into account only those loci identified as significant by the VolcanoFinder analysis, but performed an independent genome scan. We then crosschecked significant results from the VolcanoFinder and Signet approaches and shortlisted the genomic regions supported by both. This approach thus differs from that of Zhang et al. 2021, in which the input scores per gene were obtained by considering only those loci previously pointed out by another method as putatively introgressed. Moreover, as mentioned in the previous paragraph, our approach also differs from that implemented by Gouy et al. 2017, in which the input scores assigned to each gene were represented by the variants showing the smallest P-value associated with a selection statistic, thus being informative about putative adaptive events but not introgression ones.
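      The per-gene input score described here (average frequency of archaic-derived alleles across a gene's variants) could be computed along these lines (a hypothetical sketch; the variable names and data layout are illustrative, not the authors' code):

```python
def signet_gene_scores(derived_freqs, gene_snvs):
    """Average frequency of Denisovan-like derived alleles per gene,
    used as the per-gene input score for the Signet algorithm.

    derived_freqs: {snv_id: frequency of the archaic-derived allele in Tibetans}
    gene_snvs:     {gene_name: [snv_ids mapped to that gene]}
    """
    scores = {}
    for gene, snvs in gene_snvs.items():
        freqs = [derived_freqs[s] for s in snvs if s in derived_freqs]
        # genes with no archaic-derived variants get a score of zero
        scores[gene] = sum(freqs) / len(freqs) if freqs else 0.0
    return scores

# toy example with two genes; rs4 carries no archaic-derived allele
freqs = {"rs1": 0.10, "rs2": 0.30, "rs3": 0.50}
genes = {"GENE_A": ["rs1", "rs2"], "GENE_B": ["rs3", "rs4"]}
print(signet_gene_scores(freqs, genes))
```

Every gene receives a score in this way, which is what allows Signet to scan the whole genome independently of the VolcanoFinder candidates.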

      However, as correctly pointed out by both the Reviewers, we formerly performed the Signet analysis by considering derived alleles shared between Tibetans and the Denisovan species, without filtering out those alleles that are observed also in other modern human populations. We agree with the Reviewers that this approach cannot rule out the possibility of retaining false positive results ascribable to ancestral polymorphisms rather than introgressed alleles. Following the Reviewers’ suggestion, we thus repeated the Signet analysis by removing derived alleles observed also in an outgroup population of African ancestry (i.e., Yoruba), under the assumption that only Eurasian H. sapiens populations experienced Denisovan admixture. In detail, we considered only those alleles that: i) were shared between Tibetans and Denisovan (i.e., Denisovan-like alleles); ii) were assumed to be derived according to the comparison with the ancestral reconstructed reference human genome sequence; iii) were completely absent (i.e., presented a frequency equal to zero) in the Yoruba population sequenced by the 1000 Genomes Project. Although the comment of Reviewer #1 seems to propose the possible use of Han Chinese as a further control population, we decided not to filter out Denisovan-like derived alleles present also in this human group, because evidence collected so far suggests that Denisovan introgression in the gene pool of East Asian ancestors predated the split between low-altitude and high-altitude populations (Lu et al. 2016; Hu et al. 2017) and, as mentioned before, we aimed at using the Signet algorithm to validate introgression events rather than adaptive ones (see the answer to comment #6 of Reviewer #1 for further details). 
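      The three retention criteria above can be expressed as a simple predicate (a sketch under our own naming assumptions; the real filtering operated on genotype data rather than per-allele flags):

```python
def retain_allele(in_tibetans, in_denisovan, allele, ancestral_allele, yoruba_freq):
    """Keep an allele only if it satisfies all three filtering criteria:
    i)   shared between Tibetans and the Denisovan genome (Denisovan-like);
    ii)  derived with respect to the reconstructed ancestral state;
    iii) completely absent (frequency zero) in the Yoruba outgroup."""
    shared = in_tibetans and in_denisovan
    derived = allele != ancestral_allele
    absent_in_outgroup = yoruba_freq == 0.0
    return shared and derived and absent_in_outgroup

# toy checks: a Denisovan-like derived allele absent in Yoruba passes;
# the same allele segregating in Yoruba is discarded as possibly ancestral
print(retain_allele(True, True, "T", "C", 0.0))   # True
print(retain_allele(True, True, "T", "C", 0.12))  # False
```

Note that criterion iii) uses Yoruba, not Han Chinese, as the control, matching the reasoning about the timing of Denisovan admixture given in the text.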
Moreover, we would like to remark that we decided to maintain the Signet analysis as a validation method in the revised version of the manuscript because: i) comments from both the Reviewers converge in suggesting how to effectively improve this approach, and ii) it represents a method that goes beyond the simple identification of single putative introgressed alleles, by instead enabling us to point out those biological functions that might have been collectively shaped by gene flow from Denisovans.

      In addition to validating genomic regions putatively affected by archaic introgression by crosschecking results from the VolcanoFinder and Signet analyses, following the suggestion by Reviewer #1 we implemented a further validation procedure aimed at formally testing for the adaptive evolution of the identified candidate introgressed loci. For this purpose, we applied the LASSI likelihood haplotype-based method (Harris & DeGiorgio 2020) to Tibetan whole-genome data. Notably, we chose this approach mainly for the following reasons: i) it is able to detect and distinguish genomic regions that have experienced different types of selective events (i.e., strong and weak ones); ii) it has been demonstrated to have increased power in identifying them with respect to other selection statistics (e.g., H12 and nSL) (Harris & DeGiorgio 2020). Again, we performed an independent genome scan using the LASSI algorithm and then crosschecked the obtained significant results with those previously supported by the VolcanoFinder and Signet approaches in order to shortlist genomic regions that have plausibly experienced both archaic introgression and adaptive evolution.

      Moreover, we maintained a final validation step represented by the Haplostrips analysis, which was specifically performed on chromosomal segments supported by results from the VolcanoFinder, Signet, and LASSI approaches. This enabled us to assess the similarity between Denisovan haplotypes and those observed in Tibetans (i.e., the population under study, in which archaic alleles might have played an adaptive role in response to high-altitude selective pressures), Han Chinese (i.e., a sister group whose common ancestors with Tibetans experienced Denisovan admixture but then evolved at low altitude), and Yoruba (i.e., an outgroup assumed not to have received gene flow from Denisovans). 

      In conclusion, we believe that the substantial changes incorporated in the manuscript according to the Reviewers’ suggestions strongly improved the study by enabling us to focus on more solid results with respect to those formerly presented. Interestingly, although the single candidate loci supported by all the approaches now implemented for validating the obtained results have attained higher prioritization with respect to previous ones (which are supported by some but not all the adopted methods), angiogenesis still stands out as one of the main biological functions that have been shaped by events of adaptive introgression in human groups of Tibetan ancestry. This provides new evidence for the contribution of introgressed Denisovan alleles other than the EPAS1 ones in modulating the complex adaptive responses evolved by Himalayan populations to cope with selective pressures imposed by high altitudes.

      Responses to Recommendations For The Authors:

      Reviewer #1:

      The authors mainly relied on one method, VolcanoFinder (VF), to detect adaptive introgression signals. As one of the recently developed methods, VF indeed demonstrated statistical power at detecting mild selection on archaic variants, as well as detecting soft sweeps on standing variations. However, compared to other commonly used methods for detecting adaptive introgression, such as the U and Q stats (Racimo et al. 2017), genomatnn (Gower et al. 2021), or MaLAdapt (Zhang et al. 2023),

      VF doesn't seem to have better power at capturing mild and incomplete sweeps. And it makes me wonder about the justification for choosing VF over other methods here, which is not clearly explained in the manuscript. If these adaptive introgression candidates are legitimate, even if the signals are mild, at least some of the other methods should be able to recapitulate the signature (even if they don't necessarily make it through the genome-wide significance thresholds). I would be more convinced about the archaic origin of these regions if the authors could validate their reported findings using some of the aforementioned other methods. 

      According to the Reviewer’s suggestion, in the revised version of the manuscript we have expanded the considerations concerning the rationale that guided the choice of the adopted methods. In particular, in the Materials and methods section (see page 12) we have specified the reasons for having used the VolcanoFinder algorithm. 

      First, it represents one of the few approaches that rely on a model able to jointly test for the occurrence of archaic introgression and the adaptive evolution of the genomic regions affected by archaic gene flow, without the need to consider the putative source of introgression. This was a relevant aspect for us, because we planned to adopt at least two main independent (and possibly quite different in terms of the underlying approaches) methods to validate the identified candidate introgressed loci, and the other algorithm we used (i.e., Signet) was explicitly based on the comparison of modern data with the archaic sequence. Accordingly, the model tested by VolcanoFinder differs from those considered by the RD, U and Q statistics. In fact, the RD statistic is aimed at identifying regions of the genome with low divergence with respect to a given archaic reference, while the U/Q statistics can detect those chromosomal segments enriched in alleles that i) are uniquely shared between the admixed group (e.g., Tibetans) and the source population (e.g., Denisovans), and ii) present a frequency above a specific threshold in the admixed population (Racimo et al. 2016). For instance, all the loci considered as likely involved in adaptive introgression events by Racimo et al. 2016 presented remarkable frequencies, with most of them showing values above 50%. That being so, we decided not to implement these methods because we believe that they are more suitable for the detection of adaptive introgression events involving few variants with a strong effect on the phenotype, which entail a substantial increase in frequency in the population subjected to the selective pressure (i.e., cases such as that of EPAS1), while it appears challenging to choose an arbitrary frequency threshold appropriate for the detection of weak and/or polygenic selective events. 

      As regards the possible use of the MaLAdapt or genomatnn approaches as validation methods, we believe that they require more demanding computational efforts with respect to the Signet algorithm and, above all, have the disadvantage of requiring training on simulated genomic data. This makes them more prone to the potential bias introduced into the obtained results by simulations that do not carefully reflect the evolutionary history of the population under study.

      Overall, we do not agree with the Reviewer’s statement that we mainly relied on a single method to detect adaptive introgression signals because, as mentioned above, the Signet algorithm was specifically used to identify genomic regions putatively affected by introgression. This method relies on assumptions very similar to those described above for the U/Q statistics (e.g., it considers alleles uniquely shared between Tibetans and Denisovans), but avoids the need to select a frequency threshold to shortlist the most likely adaptive introgressed loci. In addition, according to another suggestion by the Reviewer, we have now implemented a further approach to provide evidence for the adaptive evolution of the candidate introgressed loci (see response to comment #3).  

      As regards the use of Signet, based on comments from both the Reviewers we realized that the rationale behind such a validation approach was not well described in the original version of the manuscript. First and foremost, we would like to clarify that in the present study we did not use this method to test for the action of natural selection (as it was formerly used by Gouy et al. 2017), but specifically to identify genomic regions putatively affected by archaic introgression. For this purpose, we followed the approach described by Gouy and Excoffier (2020), searching for significant networks of genes presenting archaic-derived variants observable in the considered Tibetan populations. That being so, we used the Signet method as an independent approach to obtain a first validation of the VolcanoFinder results. However, following suggestions from both the Reviewers, we modified the criteria adopted to filter for archaic-derived variants, excluding those alleles in common between Denisovan and the Yoruba outgroup population (see response to comment #6 for further information regarding this aspect). 

      To sum up, we think that the combination of the VolcanoFinder and Signet+LASSI approaches offered a good compromise between the computational effort required to shortlist the most robust candidate adaptive introgressed loci and the typology of models tested (i.e., models that do not discard a priori genomic signatures ascribable to weak and/or polygenic selective events). Moreover, we would like to remark that we decided to maintain the Signet method as a validation approach in the revised version of the manuscript because: i) comments from both the Reviewers converge in suggesting how to effectively improve this approach, and ii) it represents a method that can be used both to perform single-locus validation analysis and to search for those biological functions that have been collectively most impacted by archaic introgression, allowing us to test a more realistic approximation of the polygenic model of adaptation involving introgressed alleles. In fact, although the single candidate loci supported by all the approaches now implemented for validating the obtained results (see responses to comments #3 and #7 for further details) have attained higher prioritization with respect to previous ones (i.e., EP300 and NOS2, which are now supported by some but not all the adopted methods), angiogenesis still stands out as one of the main biological functions that have been shaped by events of adaptive introgression in the ancestors of Tibetan populations. 

      Besides, I am a little surprised to see that in Supplementary Figure 2, VF didn't seem to capture more significant LR values in the EPAS1 region (positive control of adaptive introgression) than in the negative control EGLN1 region. The author explained this as the selection on EPAS1 region is "not soft enough", which I find a bit confusing. If there is no major difference in significant values between the positive and negative controls, how would the authors be convinced the significant values they detected in their two genes are true positives? I would like to see more discussion and justification of the VF results and interpretations.

      In the light of the Reviewer’s observation, and according to Reviewer #2’s overall comment on the procedures implemented for filtering VolcanoFinder results, we realized that both the normalization of LR scores and the use of a sliding-windows approach might introduce erratic changes, which depend on the thresholds adopted, in the list of the genomic regions considered as the most likely candidates to have experienced adaptive introgression. To avoid this issue, and to adhere more strictly to the VolcanoFinder pipeline of analyses developed by Setter et al. 2020, in the revised version of the manuscript we have opted to use raw LR scores and to shortlist the most significant results by focusing on loci showing values falling in the top 5% of the genomic distribution obtained for such a statistic (see Materials and methods, page 13 lines 4-16 for further details).

      By following this approach, we indeed observed a pattern clearer than that previously described, in which the distribution of LR scores in the EPAS1 genomic region is remarkably different with respect to that obtained for the EGLN1 gene (Figure 2 – figure supplement 1). In more detail, we identified a total of 19 EPAS1 variants showing scores within the top 5% of LR values, in contrast to only three EGLN1 SNVs. Moreover, LR values were collectively more aggregated in the EPAS1 genomic region and showed a higher average value with respect to what was observed for EGLN1. We reported the LR values, as well as the -log(a) scores, calculated for these control genes in Supplement tables 3 and 4.

      Nevertheless, we agree with the Reviewer that results pointed out by VolcanoFinder need to be confirmed by additional methods, which is what we did to define both the new candidate adaptive introgressed loci and the considered positive/negative controls. In fact, the validation analyses performed to confirm signatures of both archaic introgression and adaptive evolution (i.e., Signet, LASSI and Haplostrips) converged in indicating that Tibetan variability at the EGLN1 gene does not seem to have been shaped by archaic introgression events but only by the action of natural selection (see Results, page 5 lines 3-9, page 6 lines 23-25, page 7 lines 29-36; Discussion page 14 lines 33-36; Figure 2 – figure supplement 1B and Figure 4 – figure supplement 1B, 3B and 3D), also in accordance with what was previously proposed (Hu et al., 2017). On the other hand, results from all validation analyses confirmed adaptive introgression signatures at the EPAS1 genomic region (see Results page 4 lines 32-37, page 5 lines 1-2 and 30-34, page 6 lines 23-29; Figure 3A, 3B and Figure 4 – figure supplement 1A, 3A and 3C). 

      Finally, as already reported in the former version of the manuscript, our choice of considering EPAS1 and EGLN1 respectively as positive and negative controls for adaptive introgression was guided by previous evidence suggesting these loci as targets of natural selection in high-altitude Himalayan populations (Yang et al., 2017; Liu et al., 2022), although only EPAS1 was proved to have been involved also in an adaptive introgression event (Huerta-Sanchez et al., 2014; Hu et al., 2017). 

      With that being said, I suggest the authors try to first validate the signal of positive selection in the two gene regions using methods such as H2/H1 (Garud et al. 2015), iHS (Voight et al. 2006) etc. that have demonstrated power and success at detecting mild sweeps and soft sweeps, regardless of if these are adaptive introgression.

      According to the Reviewer’s suggestion, we validated the new candidate adaptive introgressed loci by also using a method to formally test for the action of natural selection. In particular, we decided to use the LASSI (Likelihood-based Approach for Selective Sweep Inference) algorithm developed by Harris & DeGiorgio (2020) mainly for the following reasons: i) it is able to identify both strong and weak genomic signatures of positive selection, similarly to other approaches, but it can additionally distinguish these signals by explicitly classifying genomic windows as affected by hard or soft selective sweeps; ii) when applied to simulated data generated under different demographic models and with a range of values for the parameters that describe a selective event (e.g., the time at which the beneficial mutation arose, the selection coefficient s), it has been shown to have increased power with respect to traditional selection scans, such as nSL, H2/H1 and H12 (see Harris & DeGiorgio 2020 for further details).  

      According to such an approach, we were able to recapitulate signatures of natural selection previously observed in Tibetans for both EPAS1 and EGLN1 (Figure 4 – figure supplement 1 and 3C – 3D). We also obtained comparable patterns for our previous candidate adaptive introgressed loci (i.e., EP300 and NOS2), as well as for the new ones that have instead been prioritized in the revised version of the manuscript according to consistent results also from the VolcanoFinder, Signet and Haplostrips analyses (see Results, page 6 lines 30-35; Figure 4C, 4D, Figure 4 – figure supplement 2C and 2D).    

      With regard to the plausible archaic origin of the haplotypes under selection in these gene regions, my concern comes from the fact that other recent studies characterizing the archaic ancestry landscape in Tibetans and East Asians (eg. SPrime reports from Browning et al. 2018, as well as ArchaicSeeker reports from Yuan et al. 2021) didn't report archaic segments in regions overlapping with EP300 and NOS2. So how would the authors explain the discrepancy here, that adaptive introgression is detected yet there is little evidence of archaic segments in the regions? 

      We thank the Reviewer for the comment and the references provided. However, having read the suggested articles, neither of them seems to have analysed genomes from individuals of Tibetan ancestry. Moreover, in the study by Yuan et al. 2021 we were not able to find any table or supplementary table reporting the genomic segments showing signatures of Denisovan-like introgression in East Asian groups, with only findings from enrichment analyses performed on significant results being described for the Papuan population. In any case, as reported below in the response to comment #5, and in line with what the Reviewer observed for the original version of the manuscript, according to the additional validation analyses implemented during this revision EP300 and NOS2 received lower prioritization with respect to other loci showing more robust signatures supporting introgression of Denisovan alleles in the gene pool of Tibetan ancestors (i.e., TBC1D1, PRKAG2, KRAS and RASGRF2). Three out of four of these genes are also in accordance with previously published results supporting introgression of Denisovan alleles in the ancestors of present-day Han Chinese (Browning et al. 2018) or directly in the Tibetan genomes (Hu et al. 2017) (see Results, page 5 lines 10-21 and Supplement table 5). Despite that, the reason why not all the candidate adaptive introgression regions detected by our analyses are found among the results of Browning et al. 2018 may be that in Han Chinese this archaic variation could have evolved neutrally after the introgression events, thus preventing the identification of chromosomal segments enriched in putative archaic introgressed variants according to the VolcanoFinder and LASSI approaches (which also consider the impact of natural selection). In fact, the Sprime method implemented by Browning et al. 2018 focuses only on introgression events rather than adaptive introgression ones. For instance, the Denisovan-like regions identified with Sprime in Han Chinese by that study do not comprise the EPAS1 region at all. 

      Additionally, looking at Figure 4 and Supplementary Figure 4, the authors showed haplotype comparisons between Tibetans, Denisovan, and Han Chinese for EP300 and NOS2 regions. However, in both figures, there are about equal number of Tibetans and Han Chinese that harbor the haplotype with somewhat close distance to the Denisovan genotype. And this closest haplotype is not even that similar to the Denisovan. So how would the authors rule out the possibility that instead of adaptive introgression, the selection was acting on just an ancestral modern human haplotype?

      We agree with the Reviewer that, according to the analyses presented in the original version of the manuscript, the haplotype patterns observed at the EP300 and NOS2 loci by means of the Haplostrips approach cannot rule out the possibility that their adaptive evolution involved ancestral modern human haplotypes. In fact, after the modifications implemented in the adopted pipeline of analyses based on the Reviewers’ suggestions, their role in modulating complex adaptations to high altitude was confirmed also by results obtained with the LASSI algorithm (in addition to results from previous studies: Bigham et al., 2010; Zheng et al., 2017; Deng et al., 2019; X. Zhang et al., 2020), but their putative archaic origin received lower prioritization with respect to other loci, not being confirmed by all the analyses performed.

      Furthermore, I have a question about how exactly the authors scored the genes in their network analysis using Signet. The manuscript mentioned they were looking for enrichment of archaic-like derived alleles, and in the methods section, they mentioned they used SNPs that are present in both Denisovan and Tibetan genomes but are not in the chimp ancestral allele state. But are these "derived" alleles also present in Han Chinese or Africans? If so, what are the frequencies? And if the authors didn't use derived alleles exclusively shared between Tibetans and Denisovans, that may lead to false positives of the enrichment analysis, as the result would not be able to rule out the selection on ancestral modern human variation.

      As mentioned in the response to comment #1, following the suggestions of both the Reviewers we have modified the criteria adopted for filtering archaic-derived variants exclusively shared between Denisovans and Tibetans. In particular, we retained as input for the Signet analysis only those alleles that: i) were shared between Tibetans and Denisovan (i.e., Denisovan-like alleles); ii) were in their derived state; and iii) were completely absent (i.e., showed a frequency equal to zero) in the Yoruba population sequenced by the 1000 Genomes Project, used here as an outgroup under the assumption that only Eurasian H. sapiens populations experienced Denisovan admixture. We instead decided not to filter out potential Denisovan-like derived alleles present also in the Han Chinese population, because multiple lines of evidence agree in indicating that gene flow from Denisovans occurred in the ancestral East Asian gene pool no sooner than 48–46 thousand years ago (Teixeira et al. 2019; Zhang et al. 2021; Yuan et al. 2021), thus predating the split between low-altitude and high-altitude groups, which occurred approximately 15 thousand years ago (Lu et al. 2016; Hu et al. 2017). In fact, traces of such an archaic gene flow are still detectable in the genomes of several low-altitude populations of East Asian ancestry (Yuan et al. 2021).

      Concerning the above, I would also suggest the authors replot their Figure 4 and Figure S4 by adding the African population (eg. YRI) in the plot, and examine the genetic distance among the modern human haplotypes, in contrast to their distance to Denisovan.

      According to the Reviewer’s suggestion, after having identified new candidate adaptive introgressed loci according to the revised pipeline of analyses, we ran the Haplostrips algorithm, including in the dataset 27 individuals (i.e., 54 haplotypes) from the Yoruba population sequenced by the 1000 Genomes Project (Figure 4A, 4B, Figure 4 - figure supplement 2A, 2B, 3A).

      Reviewer #2:

      In the methods the authors write "Since composite likelihood statistics are not associated with pvalues, we implemented multiple procedures to filter SNVs according to the significance of their LR values." What does significance mean here?

      After modifications applied to the adopted pipeline of analyses according to the Reviewers’ suggestions (see responses to public reviews and to comments #1, #3, #6, #7 of Reviewer #1), new candidate adaptive introgressed loci have been identified specifically by focusing on variants showing LR values falling in the top 5% of the genomic distribution obtained for such a statistic in order to adhere more strictly to the VolcanoFinder approach developed by Setter et al. 2020. Therefore, the related sentence in the materials and methods section was modified accordingly.
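
      The top-5% cutoff described above can be sketched as follows. This is only an illustration of ranking variants by their likelihood-ratio (LR) score and keeping the top fraction; the function name and input format are hypothetical, and VolcanoFinder's actual output handling may differ.

      ```python
      def top_lr_variants(lr_values, top_fraction=0.05):
          """Return the indices of variants whose LR score falls in the top
          fraction (default 5%) of the genome-wide distribution."""
          ranked = sorted(range(len(lr_values)), key=lambda i: lr_values[i], reverse=True)
          n_keep = max(1, int(round(len(lr_values) * top_fraction)))
          return set(ranked[:n_keep])
      ```

      With 100 variants, the top 5% corresponds to the 5 highest-scoring positions.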

      Signet should be cited the first time it appears in the manuscript. The citation in the references is wrong. It lists R. Nielsen as the last author, but R. Nielsen is not an author of this paper.

      We thank the Reviewer for the comment. We have now mentioned the article by Gouy and Excoffier (2020) in the Results section where the Signet algorithm was first described and we have corrected the related reference.

      I could not find Figure 5 which is cited in the methods in the main text. I assume the authors mean Supplementary Figure 5, but the supplementary files have Figure 4.

      We thank the Reviewer for the comment. We have checked and modified figures included in the article and in the supplementary files to fix this issue.

      I didn't see a table with the genes identified as adaptively introgressed with VolcanoFinder. This would be useful as I believe this is the first time VolcanoFinder is being used on Tibetan data?

      According to the Reviewer’s suggestion, we have reported in Supplementary Table 2 all the variants showing LR scores falling in the top 5% of the genomic distribution obtained for such a statistic, along with the associated α parameters computed by the VolcanoFinder algorithm.

      It is easier for the reviewer if lines have numbers.

      According to the Reviewer’s suggestion, we have included line numbers in the revised version of the manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors investigated how the presence of interspecific introgressions in the genome affects the recombination landscape. This research was intended to inform about genetic phenomena influencing the evolution of introgressed regions, although it should be noted that the research itself is based on examining only one generation, which limits the possibility of drawing far-reaching evolutionary conclusions. In this work, yeast hybrids with large (from several to several dozen percent of the chromosome length) introgressions from another yeast species were crossed. Then, the products of meiosis were isolated and sequenced, and on this basis, the genome-wide distribution of both crossovers (COs) and noncrossovers (NCOs) was examined. Carrying out the analysis at different levels of resolution, it was found that in the regions of introgression, there is a very significant reduction in the frequency of COs and a simultaneous increase in the frequency of NCOs. Moreover, it was confirmed that introgressions significantly limit the local shuffling of genetic information, and NCOs are only able to slightly contribute to the shuffling, thus they do not compensate for the loss of CO recombination.

      Strengths:

      - Previously, experiments examining the impact of SNP polymorphism on meiotic recombination were conducted either on the scale of single hotspots or the entire hybrid genome, but the impact of large introgressed regions from another species was not examined. Therefore, the strength of this work is its interesting research setup, which allows for providing data from a different perspective.

      - Good quality genome-wide data on the distribution of CO and NCO were obtained, which could be related to local changes in the level of polymorphism.

      Weaknesses:

      (1)  The research is based on examining only one generation, which limits the possibility of drawing far-reaching evolutionary conclusions. Moreover, meiosis is stimulated in hybrids in which introgressions occur in a heterozygous state, which is a very unlikely situation in nature. Therefore, I see the main value of the work in providing information on the CO/NCO decision in regions with high sequence diversification, but not in the context of evolution.

      While we are indeed only examining recombination in a single generation, we respectfully disagree that our results aren't relevant to evolutionary processes. The broad goals of our study are to compare recombination landscapes between closely related strains, and we highlight dramatic differences between recombination landscapes. These results add to a body of literature that seeks to understand the existence of variation in traits like recombination rate, and how recombination rate can evolve between populations and species. We show here that the presence of introgression can contribute to changes in recombination rate measured in different individuals or populations, which has not been previously appreciated. We furthermore show that introgression can reduce shuffling between alleles on a chromosome, which is recognized as one of the most important determinants for the existence and persistence of sexual reproduction across all organisms. As we describe in our introduction and conclusion, we see our experimental exploration of the impacts of introgression on the recombination landscape as complementary to studies inferring recombination and introgression from population sequencing data and simulations. There are benefits and challenges to each approach, but both can help us better understand these processes. In regards to the utility of exploring heterozygous introgression, we point out that introgression is often found in a heterozygous state (including in modern humans with Neanderthal and/or Denisovan ancestry). Introgression will always be heterozygous immediately after hybridization, and depending on the frequency of gene flow into the population, the level of inbreeding, selection against introgression, etc., introgression will typically be found as heterozygous.

      - The work requires greater care in preparing informative figures and, more importantly, re-analysis of some of the data (see comments below).

      More specific comments:

      (1) The authors themselves admit that the detection of NCO, due to the short size of conversion tracts, depends on the density of SNPs in a given region. Consequently, more NCOs will be detected in introgressed regions with a high density of polymorphisms compared to the rest of the genome. To investigate what impact this has on the analysis, the authors should demonstrate that the efficiency of detecting NCOs in introgressed regions is not significantly higher than the efficiency of detecting NCOs in the rest of the genome. If it turns out that this impact is significant, analyses should be presented proving that it does not entirely explain the increase in the frequency of NCOs in introgressed regions.

      We conducted a deeper exploration of the effect of marker resolution on NCO detection by randomly removing different proportions of markers from introgressed regions of the fermentation cross in order to simulate different marker resolutions of non-introgressed regions. We chose proportions of markers that would simulate different quantiles of the resolution of non-introgressed regions and repeated our standard pipeline in order to compare our NCO detection at the chosen marker densities. More details of this analysis have been added to the manuscript (lines 188-199, 525-538). We confirmed the effect of marker resolution on NCO detection (as reported in the updated manuscript and new supplementary figures S2-S10, new Table S10) and decided to repeat our analyses on the original data with a more stringent correction. For this we chose our observed average tract size for NCOs in introgressed regions (550bp), which leads to a far more conservative estimate of NCO counts (as seen in the updated Figure 2 and Table 2). This better accounts for the increased resolution in introgressed regions, and while it's possible to be more stringent with our corrections, we believe that further stringency would be unreasonable. We also see promising signs that the correction is sufficient when counting our CO and NCO events in both crosses, as described in our response to comment 39 (response to reviewer #3).
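
      The downsampling step described above amounts to randomly retaining a fixed fraction of markers in an introgressed region before re-running event calling. A minimal sketch, assuming markers are represented as a list of positions (the function and seeding scheme are our own illustration, not the authors' code):

      ```python
      import random

      def downsample_markers(marker_positions, keep_fraction, seed=0):
          """Randomly retain a fraction of markers to simulate the lower
          marker density of non-introgressed regions; positions are returned
          in sorted order so downstream event calling is unaffected."""
          rng = random.Random(seed)
          n_keep = int(round(len(marker_positions) * keep_fraction))
          return sorted(rng.sample(marker_positions, n_keep))
      ```

      Repeating this at several `keep_fraction` values (matched to quantiles of non-introgressed marker density) and re-calling NCOs each time would quantify how detection depends on resolution.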

      (2) CO and NCO analyses performed separately for individual regions rarely show statistical significance (Figures 3 and 4). I think that the authors, after dividing the introgressed regions into non-overlapping windows of 100 bp (I suggest also trying 200 bp, 500 bp, and 1kb windows), should combine the data for all regions and perform correlations to SNP density in each window for the whole set of data. Such an analysis has a greater chance of demonstrating statistically significant relationships. This could replace the analysis presented in Figure 3 (which can be moved to Supplement). Moreover, the analysis should also take into account indels.

      We're uncertain of what is being requested here. If the comment refers to the effect of marker density on NCO detection, we hope the response to comment 2 will help resolve this comment as well. Otherwise, we ask for some clarification so that we may correct or revise as appropriate.

      (3) In Arabidopsis, it has been shown that crossover is stimulated in heterozygous regions that are adjacent to homozygous regions on the same chromosome (http://dx.doi.org/10.7554/eLife.03708.001, https://doi.org/10.1038/s41467-022-35722-3).

      This effect applies only to class I crossovers, and is reversed for class II crossovers (https://doi.org/10.15252/embj.2020104858, https://doi.org/10.1038/s41467-023-42511-z). This research system is very similar to the system used by the authors, although it likely differs in the level of DNA sequence divergence. The authors could discuss their work in this context.

      We thank the reviewer for sharing these references. We have added a discussion of our work in the context of these findings in the Discussion, lines 367-376.

      Reviewer #2 (Public Review):

      Summary:

      Schwartzkopf et al. characterized the meiotic recombination impact of highly heterozygous introgressed regions within the budding yeast Saccharomyces uvarum, a close relative of the canonical model Saccharomyces cerevisiae. To do so, they took advantage of the naturally occurring Saccharomyces bayanus introgressions specifically within fermentation isolates of S. uvarum and compared their behavior to the syntenic regions of a cross between natural isolates that do not contain such introgressions. Analysis of crossover (CO) and noncrossover (NCO) recombination events shows both a depletion in CO frequency within highly heterozygous introgressed regions and an increase in NCO frequency. These results strongly support the hypothesis that DNA sequence polymorphism inhibits CO formation and has no, or much weaker, effects on NCO formation. Finally, the authors show that the presence of introgressions negatively impacts "r", the parameter that reflects the probability that a randomly chosen pair of loci shuffles their alleles in a gamete.

      The authors chose a sound experimental setup that allowed them to directly compare recombination properties of orthologous syntenic regions in an otherwise intra-specific genetic background. The way the analyses have been performed looks right, although this reviewer is unable to judge the relevance of the statistical tests used. Finally, most of their results, which are elegant and of interest to the community, are presented in Figure 2.

      Strengths:

      Analysis of crossover (CO) and noncrossover (NCO) recombination events is compelling in showing both a depletion in CO frequency within highly heterozygous introgressed regions and an increase in NCO frequency.

      Weaknesses:

      The main weaknesses refer to a few text issues and a lack of discussion about the mechanistic implications of the present findings.

      - Introduction

      (1) The introduction is rather long. I suggest specifically referring to "meiotic" recombination (line 71) and to "meiotic" DSBs (line 73) since recombination can occur outside of meiosis (i.e., somatic cells).

      We agree and have condensed the introduction to be more focused. We also made the suggested edits to include “meiotic” when referring to recombination and DSBs.

      (2) From lines 79 to 87: the description of recombination is unnecessarily complex and confusing. I suggest the authors simply remind that DSB repair through homologous recombination is inherently associated with a gene conversion tract (primarily as a result of the repair of heteroduplex DNA by the mismatch repair (MMR) machinery) that can be associated or not to a crossover. The former recombination product is a crossover (CO), the latter product is a noncrossover (NCO) or gene conversion. Limited markers may prevent the detection of gene conversions, which erase NCO but do not affect CO detection.

      We changed the language in this section to reflect the reviewer’s suggestions.

      (3) In addition, "resolution" in the recombination field refers to the processing of a double Holliday junction containing intermediates by structure-specific nucleases. To avoid any confusion, I suggest avoiding using "resolution" and simply sticking with "DSB repair" all along the text.

      We made the suggested correction throughout the paper.

      (4) Note that there are several studies about S. cerevisiae meiotic recombination landscapes using different hybrids that show different CO counts. In the introduction, the authors refer to Mancera et al 2008, a reference paper in the field. In this paper, the hybrid used showed ca. 90 CO per meiosis, while their reference to Liu et al 2018 in Figure 2 shows less than 80 COs per meiosis for S. cerevisiae. This shows that it is not easy to come up with a definitive CO count per meiosis in a given species. This needs to be taken into account for the result section line 315-321.

      This is an excellent point. We added this context in the results (lines 180-187).

      (5) In line 104, the authors refer to S. paradoxus and mention that its recombination rate is significantly different from that of S. cerevisiae. This is inaccurate since this paper claims that the CO landscape is even more conserved than the DSB landscape between these two species, and they even identify a strong role played by the subtelomeric regions. So, the discussion about this paper cannot stand as it is.

      We agree with the reviewer's point. We also found that the entire paragraph was unnecessary, so it and the sentence in question have been removed.

      (6) Line 150, when the authors refer to the anti-recombinogenic activity of the MMR, I suggest referring to the published work from Martini et al 2011 rather than the not-yet-published work from Copper et al 2021, or both, if needed.

      Added the suggested citation.

      Results

      (7) The clear depletion in CO and the concomitant increase in NCO within the introgressed regions strongly suggest that DNA sequence polymorphism triggers CO inhibition but does not affect NCO or to a much lower extent. Because most CO likely arises from the ZMM pathway (CO interference pathway mainly relying on Zip1, 2, 3, 4, Spo16, Msh4, 5, and Mer3) in S. uvarum as in S. cerevisiae, and because the effect of sequence polymorphism is likely mediated by the MMR machinery, this would imply that MMR specifically inhibits the ZMM pathway at some point in S. uvarum. The weak effect or potential absence of the effect of sequence polymorphism on NCO formation suggests that heteroduplex DNA tracts, at least the way they form during NCO formation, escape the anti-recombinogenic effect of MMR in S. uvarum. A few comments about this could be added.

      We have added discussion and citations regarding the biased repair of DSB to NCO in introgression, lines 380-386.

      (8) The same applies to the fact that the CO number is lower in the natural cross compared to the fermentation cross, while the NCO number is the same. This suggests that under similar initiating Spo11-DSB numbers in both crosses, the decrease in CO is likely compensated by a similar increase in inter-sister recombination.

      Thank you to the reviewer for this observation. We agree that this could explain some differences between the crosses.

      (9) Introgressions represent only 10% of the genome, while the decrease in CO is at least 20%. This is a bit surprising especially in light of CO regulation mechanisms such as CO homeostasis that tends to keep CO constant. Could the authors comment on that?

      We interpret these results to reflect two underlying mechanisms. First, the presence of heterozygous introgression does reduce the number of COs. Second, we believe the difference in COs reflects variation in recombination rate between strains. We note that CO homeostasis need not apply across different genetic backgrounds. Indeed, recombination rate is appreciated to significantly differ between strains of S. cerevisiae (Raffoux et al. 2018), and recombination rate variation has been observed between strains/lines/populations in many different species including Drosophila, mice, humans, Arabidopsis, maize, etc. We reference S. cerevisiae strain variability in the Introduction lines 128-130, and have added context in the Results lines 180-187, and Discussion lines 343-350.

      (10) Finally, the frequency of NCOs in introgressed regions is about twice the frequency of CO in non-introgressed regions. Both CO and NCO result from Spo11-initiating DSBs.

      This suggests that more Spo11-DSBs are formed within introgressed regions and that such DSBs specifically give rise to NCO. Could this be related to the lack of homolog engagement which in turn shuts down Spo11-DSB formation as observed in ZMM mutants by the Keeney lab? Could this simply result from better detection of NCO in introgressed regions related to the increased marker density, although the authors claim that NCO counts are corrected for marker resolution?

      The effect noted by the reviewer remains despite the more conservative correction for marker density applied to NCO counts (as described in the response to Reviewer 1, comment #2). Given that CO+NCO counts in introgressed regions are not statistically different between crosses, it is likely that these regions are simply predisposed to a higher rate of DSBs than the rest of the genome. This is an interesting observation, however, and one that we would like to further explore in future work.

      (11) What could be the explanation for chromosome 12 to have more shuffling in the natural cross compared to the fermentation cross which is deprived of the introgressed region?

      We added this text to the Results, lines 323-327, "While it is unclear what potential mechanism is mediating the difference in shuffling on chromosome 12, we note that the rDNA locus on chromosome 12 is known to differ dramatically in repeat content across strains of S. cerevisiae (22–227 copies) (Sharma et al. 2022), and we speculate that differences in rDNA copy number between strains in our crosses could impact shuffling."

      Technical points:

      (12) In line 248, the authors removed NCO with fewer than three associated markers.

      What is the rationale for this? Is the genotyping strategy not reliable enough to consider events with only one or two markers? NCO events can be rather small and even escape detection due to low local marker density.

      We trust the genotyping strategy we used, but chose to be conservative in our detection of NCOs to account for potential sequencing biases.

      (13) Line 270: The way homology is calculated looks odd to this reviewer, especially the meaning of 0.5 homology. A site is either identical (1 homology) or not (0 homology).

      We've changed the language to better reflect what we are calculating (diploid sequence similarity; see comment #28). Essentially, the metric is a probability that two randomly selected chromatids--one from each parent--will share the same nucleotide at a given locus (akin to calculating the probability of homozygous offspring at a single locus). We average it along a segment of the genome to establish an expected sequence similarity if/when recombination occurs in that segment.
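
      The per-site computation can be sketched as follows. This is a minimal illustration under our own representation (each parent as a pair of alleles); the function names are hypothetical and not the paper's code. It also shows why a heterozygous site yields a value of 0.5: two of the four possible chromatid pairings match.

      ```python
      def site_similarity(parent1, parent2):
          """Probability that two randomly drawn chromatids, one from each
          diploid parent, carry the same base at this site.
          Each parent is given as a pair of alleles, e.g. ("A", "T")."""
          matches = sum(a == b for a in parent1 for b in parent2)
          return matches / 4.0

      def window_similarity(sites):
          """Expected sequence similarity over a genomic segment: the mean
          of per-site similarities for (parent1, parent2) genotype pairs."""
          return sum(site_similarity(p1, p2) for p1, p2 in sites) / len(sites)
      ```

      For instance, an A/T heterozygote in one parent against an A/A homozygote in the other gives 0.5, while matching homozygotes give 1.0.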

      (14) Line 365: beware that the estimates are for mitotic mismatch repair (MMR). Meiotic MMR may work differently.

      We removed the citation that refers exclusively to mitotic recombination. The statement regarding meiotic recombination is otherwise still reflective of results from Chen & Jinks-Robertson.

      (15) Figure 1: there is no mention of potential 4:0 segregations. Did the authors find no such pattern? If not, how did they consider them?

      The program we used to call COs and NCOs (ReCombine's CrossOver program) can detect such patterns, but none were detected in our data.

      Reviewer #3 (Public Review):

      When members of two related but diverged species mate, the resulting hybrids can produce offspring where parts of one species' genome replace those of the other. These "introgressions" often create regions with a much greater density of sequence differences than are normally found between members of the same species. Previous studies have shown that increased sequence differences, when heterozygous, can reduce recombination during meiosis specifically in the region of increased difference. However, most of these studies have focused on crossover recombination, and have not measured noncrossovers. The current study uses a pair of Saccharomyces uvarum crosses: one between two natural isolates that, while exhibiting some divergence, do not contain introgressions; the other is between two fermentation strains that, when combined, are heterozygous for 9 large regions of introgression that have much greater divergence than the rest of the genome. The authors wished to determine if introgressions differently affected crossovers and noncrossovers, and, if so, what impact that would have on the gene shuffling that occurs during meiosis.

      (1) While both crossovers and noncrossovers were measured, assessing the true impact of increased heterology (inherent in heterozygous introgressions) is complicated by the fact that the increased marker density in heterozygous introgressions also increases the ability to detect noncrossovers. The authors used a relatively simple correction aimed at compensating for this difference, and based on that correction, conclude that, while as expected crossovers are decreased by increased sequence heterology, counter to expectations noncrossovers are substantially increased. They then show that, despite this, genetic shuffling overall is substantially reduced in regions of heterozygous introgression. However, it is likely that the correction used to compensate for the effect of increased sequence density is defective, and has not fully compensated for the ascertainment bias due to greater marker density. The simplest indication of this potential artifact is that, when crossover frequencies and "corrected" noncrossover frequencies are taken together, regions of introgression often appear to have greater levels of total recombination than flanking regions with much lower levels of heterology. This concern seriously undercuts virtually all of the novel conclusions of the study. Until this methodological concern is addressed, the work will not be a useful contribution to the field.

      We appreciate this concern. Please see responses to comments #2 and #38. We further note that our results depicted in Figures 3 and 4 are not reliant on any correction or comparison with non-introgressed regions, and thus we consider our results regarding sequence similarity, its effect on the repair of DSBs, and the amount of genetic shuffling with/without introgression to be novel and important observations for the field.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Line 149 - this sentence refers to a mixture of papers reporting somatic or meiotic recombination and as these processes are based on different crossover pathways, this should not be mixed. For example, it is known that in Arabidopsis MSH2 has a pro-crossover function during meiotic recombination.

      Corrected

      (2) What is unclear to me is how the crosses are planned. Line 308 shows that there were only two crosses (one "natural" and one "fermentation"), but I understand that this is a shorthand and in fact several (four?) different strains were used for the "fermentation cross". At least that's what I concluded from Fig. 1B and its figure caption. This needs to be further explained. Were different strains used for each fermentation cross, or was one strain repeated in several crosses? In Figure 1, it would be worth showing, next to the panel showing "fermentation cross", a diagram of how "natural cross" was performed, because as I understand it, panel A illustrates the procedure common to both types of crosses, and not for "natural cross".

      We thank the reviewer for drawing our attention to confusion about how our crosses were created. We performed two crosses, as depicted in Figure 1A. The fermentation cross is a single cross from two strains isolated from fermentation environments. The natural cross is a single cross from two strains isolated from a tree and insect. Table S1 and the methods section "Strain and library construction" describe the strains used in more detail. We modified Figure 1 and the figure legend to help clarify this. See also response to comment #37.

      (3) The authors should provide a more detailed characterization of the genetic differences between chromosomes in their hybrids. What is the level of polymorphism along the S. uvarum chromosomes used in the experiments? Is this polymorphism evenly distributed? What are the differences in the level of polymorphism for individual introgressions? Theoretically, this data should be visible in Figure 2D, but this figure is practically illegible in the present form (see next comment).

      As suggested, we remade Figure 2D to only include chromosomes with an introgression present, and moved the remaining chromosomes to the supplements (Figure S11). The patterns of markers (which are fixed differences between the strains in the focal cross) should be more clear now. As we detail in the Methods line 507-508, we utilized a total of 24,574 markers for the natural cross and 74,619 markers for the fermentation cross (the higher number in the fermentation cross being due to more fixed differences in regions of introgression).

      (4) Figure 2D should be prepared more clearly, I would suggest stretching the chromosomes, otherwise, it is difficult to see what is happening in the introgression regions for CO and NCO (data for SNPs are more readable). Maybe leave only the chromosomes with introgressions and transfer the rest to the supplement?

      See previous comment.

      (5) How are the Y scales defined for Figure 2D?

      Figure 2D now includes units for the y-axis.

      (6) Are increases in CO levels in fermentation cross-observed at the border with introgressions? This would indicate local compensation for recombination loss in the introgressed regions, similar to that often observed for chromosomal inversions.

      We see no evidence of an increase in CO levels at the borders of introgressions, either through visual inspection or by comparing the average CO rate in all fermentation windows to that of windows at the edges of introgressions. This is included in the Discussion lines 360-366, "While we are limited in our interpretations by only comparing two crosses (one cross with heterozygous introgression and one without introgression), these results are in line with findings in inversions, where heterozygotes show sharp decreases in COs, but the presence of NCOs in the inverted region (Crown et al., 2018; Korunes & Noor, 2019). However, unlike heterozygous inversions where an increase in COs is observed on freely recombining chromosomes (the inter-chromosomal effect), we do not see an increase in COs on the borders flanking introgression or on chromosomes without introgression."

      (7) Line 336 - "We find positive correlations between CO counts..." - you should indicate here that between fermentation and natural crosses, it was quite hard for me to understand what you calculated.

      We corrected the language as suggested.

      (8) The term "homology" usually means "having a common evolutionary origin" and does not specify the level of similarity between sequences, thus it cannot be measured. It is used incorrectly throughout the manuscript (also in the intro). I would use the term "similarity" to indicate the degree of similarity between two sequences.

      We corrected the language as suggested throughout the document.

      (9) Paragraph 360 and Figure 3 - was the "sliding window" overlapping or non-overlapping?

      We added clarifying language to the text in both places. We use a 101bp sliding window with 50bp overlaps.
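
      The windowing scheme can be sketched as below. Note this is our own reading of "101bp windows with 50bp overlaps" (i.e., step = window size minus overlap); the exact stepping used in the paper's pipeline is an assumption here.

      ```python
      def sliding_windows(seq_len, size=101, overlap=50):
          """Return (start, end) coordinates of sliding windows of `size` bp
          in which consecutive windows share `overlap` bp (step = size - overlap)."""
          step = size - overlap
          return [(start, start + size) for start in range(0, seq_len - size + 1, step)]
      ```

      Each window's similarity (or CO/NCO count) can then be computed independently, with each pair of neighboring windows sharing 50 bp.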

      (10) Line 369 - what is "...the proportion of bases that are expected to match between the two parent strains..."?

      We clarified the language in this location, and hopefully changes associated with the comment about sequence similarity will make the comment even clearer in context.

      (11) Line 378 - should it refer to Figure S1 and not Figure 4?

      Corrected.

      (12) Line 399 - should refer to Figure 4, not Figure 5.

      Corrected

      (13) Line 444-449 - the analysis of loss of shuffling in the context of the location of introgression on the chromosome should be presented in the result section.

      We shifted the core of the analysis to the results, while leaving a brief summary in the discussion.

      (14) The authors should also take into account the presence of indels in their analyses, and they should be marked in the figures, if possible.

      We filtered out indels in our variant calling. However, we did analyze our crosses for the presence of large insertions and deletions (Table S2), which can obscure true recombination rates, and found that they were not an issue in our dataset.

      Reviewer #2 (Recommendations For The Authors):

      This reviewer suggests that the authors address the different points raised in the public review.

      (1) This reviewer would like to challenge the relevance of the r-parameter in light of chromosome 12 which has no introgression and still a strong depletion in r in the fermentation cross.

      We added this text to the Results, lines 377-381, "While it is unclear what potential mechanism is mediating the difference in shuffling on chromosome 12, we note that the rDNA locus on chromosome 12 is known to differ dramatically in repeat content across strains of S. cerevisiae (22–227 copies) (Sharma et al. 2022), and we speculate that differences in rDNA copy number between strains in our crosses could impact shuffling."

      (2) This reviewer insists on making sure that NCO detection is unaffected by the marker density, notably in the highly polymorphic regions, to unambiguously support Figure 1C.

      We've changed our correction for resolution to be more aggressive (see response to comment #2), and believe we have now adequately adjusted for marker density (see response to comment #38).

      Reviewer #3 (Recommendations For The Authors):

      I regret using such harsh language in the public review, but in my opinion, there has been a serious error in how marker densities are corrected for, and, since the manuscript is now public, it seems important to make it clear in public that I think that the conclusions of the paper are likely to be incorrect. I regret the distress that the public airing of this may cause. Below are my major concerns:

      (1) The paper is written in a way that makes it difficult to figure out just what the sequence differences are within the crosses. Part of this is, to be frank, the unusual way that the crosses were done, between more than one segregant each from two diploids in both natural and fermentation cases. I gather, from the homology calculations description, that each of these four diploids, while largely homozygous, contained a substantial number of heterozygosities, so individual diploids had different patterns of heterology. Is this correct? And if so, why was this strategy chosen? Why not start with a single diploid where all of the heterologies are known? Why choose to insert this additional complication into the mix? It seems to me that this strategy might have the perverse effect of having the heterology due to the polymorphisms present in one diploid affect (by correction) the impact of a noncrossover that occurs in a diploid that lacks the additional heterology. If polymorphic markers are a small fraction of total markers, then this isn't such a great concern, but I could not find the information anywhere in the manuscript. As a courtesy to the reader, please consider providing at the beginning some basic details about the starting strains-what is the average level of heterology between natural A and natural B, and what fraction of markers are polymorphic; what is the average level of heterology between fermentation A and fermentation B in non-introgressed regions, in introgressed regions, and what fraction of markers are polymorphic? How do these levels of heterology compare to what has been examined before in whole-genome hybrid strains? It also might be worth looking at some of the old literature describing S. cerevisiae/S. carlsbergensis hybrids.

      We thank the reviewer for drawing our attention to confusion about the cross construction. These crosses were conducted as is typical for yeast genetic crosses: we crossed 2 genetically distinct haploid parents to create a heterozygous diploid, then collected the haploid products of meiosis from the same F1 diploid. Because the crosses were made with haploid parents, it is not possible for other genetic differences to be segregating in the crosses. We have revised Figure 1 and its caption to clarify this. Further details regarding the crosses are in the Methods section "Strain and library construction" and in Supplemental Table S1. We only utilized genetic markers that are fixed differences between our parental strains to call CO and NCO. As we detail in the Methods line 507-508, we utilized a total of 24,574 markers for the natural cross and 74,619 markers for the fermentation cross (the higher number in the fermentation cross being due to more fixed differences in regions of introgression). We additionally revised Figure 2D (and Figure S11) to help readers better visualize differences between the crosses.

      (2) There are serious concerns about the methods used to identify noncrossovers and to normalize their levels, which are probably resulting in an artifactually high level of calculated crossovers in Figure 2. As a primary indication of this, it appears in Figure 2 that the total frequency of events (crossovers + noncrossovers) in heterozygous introgressed regions are substantially greater than those in the same region in non-introgressed strains, while just shifting of crossovers to noncrossovers would result in no net increase. The simplest explanation for this is that noncrossovers are being undercounted in non-introgressed relative to introgressed heterozygous regions. There are two possible reasons for this: i. The exclusion of all noncrossover events spanning less than three markers means that many more noncrossovers in introgressed heterozygous regions than in non-introgressed. Assuming that average non-homology is 5% in the former and 1% in the latter, the average 3-marker event will be 60 nt in introgressed regions and 300 nt in non-introgressed regions - so many more noncrossovers will be counted in introgressed regions. A way to check on this - look at the number of crossover-associated markers that undergo gene conversion; use the fraction that involves < 3 markers to adjust noncrossover levels (this is the strategy used by Mancera et al.). ii. The distance used for noncrossover level adjustment (2kb) is considerably greater than the measured average noncrossover lengths in other studies. The effect of using a too-long distance is to differentially under-correct for noncrossovers in non-introgressed regions, while virtually all noncrossovers in heterozygous introgressed regions will be detected. This can be illustrated by simulations that reduce the density of scored markers in heterozygous introgressed regions to the density seen in non-introgressed regions. 
Because these concerns go to the heart of the conclusions of the paper, they must be addressed quantitatively - if not, the main conclusions of the paper are invalid.

      We adjusted the correction factor (See also response to comment #2) and compared the average number of CO and NCO events in introgressed and non-introgressed regions between crosses (two comparisons: introgression CO+NCO in natural cross vs introgression CO+NCO in fermentation cross; non-introgression CO+NCO in natural cross vs non-introgression CO+NCO in fermentation cross). We found no significant differences between the crosses in either of the comparisons. This indicates that the distribution of total events is replicated in both crosses once we correct for resolution.

      (3) It is important to distinguish the landscape of double-strand breaks from the landscape of recombination frequencies. Double-strand breaks, as measured by uncalibrated levels of Spo11-linked oligos, is a relative number - not an absolute frequency. So it is possible that two species could have a similar break landscape in terms of topography but have absolute levels higher in one species than in the other.

      We agree with this statement, however, we have removed the relevant text to streamline our introduction.

      (4) Lines 123-125. Just meiosis will produce mosaic genomes in the progeny of the F1; further backcrossing will reduce mosaicism to the level of isolated regions of introgression.

      Adjusted the language to be more specific.

      (5) Please provide actual units for the Y axes in Figure 2D.

      We have corrected the units on the axes.

      (6) Tables (general). Are the significance measures corrected for multiple comparisons?

      In Table 3, the cutoff was chosen to be more conservative than a Bonferroni corrected alpha=0.01 with 9 comparisons (0.0011). In text, any result referred to as significant has an associated hypothesis test with a p-value less than its corresponding Bonferroni-corrected alpha of 0.05. This has been clarified in the caption for Table 3 and in the text where relevant.
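      The thresholds described above follow from a one-line calculation (shown here purely for illustration):

      ```python
      # Bonferroni correction: divide the nominal alpha by the number of comparisons.
      alpha = 0.01
      n_comparisons = 9
      print(round(alpha / n_comparisons, 4))  # 0.0011
      ```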

    1. “Order maintenance policing,” a type of proactive policing, is informed by the ‘broken windows’ theory — the idea that by fighting smaller crimes, it’s possible to create a ‘lawful’ environment that helps deter the more serious crimes.

      Yet, it is also known that in middle-class areas that are less policed, there is less crime reported.

    1. It has no windows, and the door swings,

      The modern world that Eliot describes seems to be defined by extremes. In this passage, however, the readers get a sense that it is an ambiguous, uncertain state in between that poses the greatest danger.

      We are first introduced to a dystopian vision of a chapel, conventionally a symbol of spirituality. In Eliot’s version, however, this chapel becomes a representation of destructive ambiguity. It has “no windows” and its door “swings.” The absence of windows invokes a feeling of isolation from the external world. What should serve as a connection to a higher power thus not only becomes a barren space that obstructs any chance of spiritual communion but also inhibits our connection to the terrestrial realm. The “swinging” door further amplifies this uncertainty: is it an exit to the chaotic world outside or an entrance to a questionable hope within the chapel? In either case, the “swinging” exemplifies this ambiguity: neither open nor closed, the door can slam shut at any moment, trapping us in a cycle of isolation and ambiguity.

      Eliot continues this notion with “only a cock” standing on a rooftree and crying “co co rico co co rico.” Cocks traditionally exclaim in the morning with a sunrise, representing a new day: a new beginning. Interestingly, the word “cock” also represents a firing lever in a gun, which, when raised, makes the gun ready for firing an attack. The cock is physically elevated, standing on a “rooftree” and following its cry another sound emerges – lightning, analogous to the sound of a bullet fired. With the cock raised and the shot fired, the new beginning transforms into a signal of an impending end. The ambiguous nature of this new beginning hints at a cycle of violence and chaos beneath these new beginnings, emphasizing the danger of embracing transitional moments.

      The use of “co co rico co co rico,” which in most Eastern European countries is the sound a cock makes, as opposed to the conventional English “cock-a-doodle-doo,” invokes a reference to Hesse’s analysis of a Russian man within The Karamazov Brothers, which Eliot cited in his notes for The Waste Land. Hesse describes precisely the danger of this ambiguity, stating that men are “halfway between good and evil.” Men are positioned in the middle between the animals and a higher consciousness: neither one nor the other, forced to attempt to repress one and let the other dominate. This animal consciousness, however, can never be fully suppressed. Instead, “everyone of these [dangerous] instincts must come sooner or later to the surface. Each instinct goes on living, not one is killed…” It is the in-between state, the coexistence of both natures and the resulting suppression of one, that leads to human violence guided by the previously suppressed instincts.

      Thus, within Eliot’s portrayal of the empty chapel and the ambiguous imagery of a cock, the in-between state is not merely a transitional phase; instead, it becomes a breeding ground for despair and chaos.

    1. and it was there the ravens liked to perch. The tree was full of them, and there were more in the arched windows overhead, all

      just like raventree hall


    1. In rodent models, exposure to phthalates and BPA can induce alterations in DNA methylation patterns (70–73), histone modifications (73, 74), and non-coding RNA expression within the germline. These changes can disrupt normal epigenetic programming during critical windows of spermatogenesis, leading to impaired sperm development, reduced sperm quality, and compromised fertility

      Phthalates and BPA are other chemicals I want to research.

    1. Hey there everybody, thanks for joining.

      It's great to have you with me in this lesson where we're going to talk about why cloud matters.

      Now to help answer that question, what I want to do firstly is talk to you about the traditional IT infrastructure.

      How did we used to do things?

      What sort of challenges and issues did we face?

      And therefore we'll get a better understanding of what cloud is actually doing to help.

      We can look at how things used to be and how things are now.

      So what we're going to do throughout this lesson is walk through a little bit of a scenario with a fictitious company called Ozzymart.

      So let's go ahead now, jump in and have a chat about the issues that they're currently facing.

      Ozzymart is a fictitious company that works across the globe selling a range of different Australia-related paraphernalia.

      Maybe stuffed kangaroo and koala toys, that sort of thing.

      Now they've currently got several different applications that they use that they provide access to for their users.

      And currently the Ozzymart team do not use the cloud.

      So when we have a look at the infrastructure hosting these applications, we'll learn that Ozzymart have a couple of servers, one server for each of the applications that they've got configured.

      Now the Ozzymart IT team have had to go and set up these servers with Windows, the applications, and all the different data needed for these applications to work.

      And what it's also important to understand about the Ozzymart infrastructure is all of this is currently hosted on their on-premises customer managed infrastructure.

      So yes, the Ozzymart team could have gone out and maybe used a data center provider.

      But the key point here is that the Ozzymart IT team have had to set up servers, operating systems, applications and a range of other infrastructure to support all of this: storage, networking, power, cooling.

      Okay, these are the sorts of things that we have to manage traditionally before we were able to use cloud.

      Now to help understand what sort of challenges that might introduce, let's walk through a scenario.

      We're going to say that the Ozzymart CEO has gone and identified the need for reporting to be performed across these two applications.

      And the CEO wants those reports to be up and ready by the end of this month.

      Let's say that's only a week away.

      So the CEO has instructed the finance manager and the finance manager has said, "Hey, awesome.

      You know what?

      I've found this great app out there on the internet called Reports For You.

      We can buy it, download it and install it.

      I'm going to go tell the IT team to get this up and running straight away."

      So this might sound a little bit familiar to some of you who have worked in traditional IT where sometimes demands can come from the top of the organization and they filter down with really tight timelines.

      So let's say for example, the finance manager is going to go along, talk to the IT team and say, "We need this Reports For You application set up by the end of month."

      Now the IT team might be a little bit scared because, hey, when we look at the infrastructure we've got, it's supporting those two servers and applications okay, but maybe we don't have much more space.

      Maybe we don't have enough storage.

      Maybe we are using something like virtualization.

      So we might not need to buy a brand new physical server and we can run up a virtual Windows server for the Reports For You application.

      But there might just not be enough resources in general.

      CPU, memory, storage, whatever it might be to be able to meet the demands of this Reports For You application.

      But you've got a timeline.

      So you go ahead, you get that server up and running.

      You install the applications, the operating system data, all there as quickly as you can to meet these timelines that you've been given by the finance manager.

      Now maybe it's not the best server that you've ever built.

      It might be a little bit rushed and a little bit squished, but you've managed to get that server up and running with the Reports For You application and you've been able to meet those timelines and provide access to your users.

      Now let's say that you've given access to your users for this Reports For You application.

      Now let's say when they start that monthly reporting job, the Reports For You application needs to talk to the data across your other two applications, the Ozzymart Store and the Ozzymart Comply application.

      And it's going to use that data to perform the reporting that the CEO has requested.

      So you kick off this report job on a Friday.

      You hope that it's going to be complete on a Saturday, but maybe it's not.

      You check again on a Sunday and things are starting to get a little bit scary.

      And uh-oh, Monday rolls around, the Reports For You report is still running.

      It has not yet completed.

      And that might not be so great because you don't have a lot of resources on-premises.

      And now all of your applications are starting to perform really poorly.

      So that Reports For You application is still running.

      It's still trying to read data from those other two applications.

      And maybe they're getting really, really slow and, let's hope not, but maybe the applications even go offline entirely.

      Now those users are going to become pretty angry.

      You're going to get a lot of calls to the help desk saying that things are offline.

      And you're probably going to have the finance manager and every other manager reaching out to you saying, this needs to be fixed now.

      So let's say you managed to push through, perhaps through the rest of Monday, and that report finally finishes.

      You clearly need more resources to be able to run this report much more quickly at the end of each month so that you don't have angry users.

      So what are you going to do to fix this for the next month when you need to run the report again?

      Well, you might have a think about ordering some new software and hardware because you clearly don't have enough hardware on-premises right now.

      You're going to have to wait some time for all of that to be delivered.

      And then you're going to have to physically install it, set it up, get it running, and make sure that you've got everything you need for Reports For You to be running with more CPU and resources next time.

      There's a lot of different work that you need to do.

      This is one of the traditional IT challenges that we might face when the business has demands and expectations for things to happen quickly.

      And it's not really necessarily the CEO or the finance manager's fault.

      They are focused on what the business needs.

      And when you work in the technology teams, you need to do what you can to support them so that the business can succeed.

      So how might we do that a little bit differently with cloud?

      Well, with cloud, we could sign up for a cloud provider, we could turn on and off servers as needed, and we could scale up and scale down, scale in and scale out resources, all to meet those demands on a monthly basis.

      So that could be a lot less work to do and it could certainly provide you the ability to respond much more quickly to the demands that come from the business.

      And rather than having to go out and buy all of this new infrastructure that you are only going to use once a month, well, as we're going to learn throughout this course, one of the many benefits of cloud is that you can turn things on and off really quickly and only pay for what you need.

      So what might this look like with cloud?

      Well, with cloud, what we might do is no longer have that rushed on-premises server that we were using for Reports For You.

      Instead of that, we can go out to a public cloud provider like AWS, GCP or hopefully Azure, and you can set up those servers once again using a range of different features, products that are all available through the various public cloud providers.

      Now, yes, in this scenario, we are still talking about setting up a server.

      So that is going to take you some time to configure Windows, set up the application, all of the data and configuration that you require, but at least you don't need to worry about the actual physical infrastructure that is supporting that server.

      You don't have to go out, talk to your procurement team, talk to different providers, and wait for physical infrastructure, licensing, software, and other assets to be delivered.

      With cloud, as we will learn, you can really quickly get online and up and running.

      And also, if we had that need to ensure that the Reports For You application was running with lots of different resources at the end of the month, it's much easier when we use cloud to just go and turn some servers on and then turn them off at the end of the month when they are no longer required.

      This is the sort of thing that we are talking about with cloud.

      We're only really just scratching the surface of what cloud can do and what cloud actually is.

      But my hope is that through this lesson, you can understand how cloud changes things.

      Cloud allows us to work with technology in a much different way than we traditionally would work with our on-premises infrastructure.

      Another example that shows how cloud is different is that rather than using the Reports For You application, what we might actually choose to do is go to a public cloud provider, or to someone who has an equivalent Reports For You solution that's entirely built in the cloud, ready to go.

      In this way, not only do we no longer have to manage the underlying physical infrastructure, we don't actually have to manage the application software installation, configuration, and all of that service setup.

      With something like reporting software that's built in the cloud, we would just provide access to our users and only have to pay on a per-user basis.

      So if you've used something like Zoom for meetings or Dropbox for data sharing, that's the sort of solution we're talking about.

      So let's consider this scenario for Ozzymart and have a think about the benefits they might access when they use the cloud.

      Well, we can much more quickly get access to resources to respond to demand.

      If we need to have a lot of different compute capacity working at the end of the month with cloud, like you'll learn, we can easily get access to that.

      If we wanted to add lots of users, we could do that much more simply as well.

      And something that the finance manager might really be happy about in this scenario is that we aren't going to go back and suggest to them that we need to buy a whole heap of new physical infrastructure right now.

      When we think about how Ozzymart would traditionally have handled this scenario, they would have to go and buy new physical servers, storage, networking, whatever that might be, to meet the needs of this Reports For You application.

      And really, they're probably going to have to strike a balance between having enough infrastructure to ensure that the Reports For You application completes its job quickly and not buying too much infrastructure that's just going to sit there unused whilst the Reports For You application isn't running.

      And really importantly, when we go to cloud, this difference of not having to buy lots of physical infrastructure upfront is referred to as capital expenditure versus operational expenditure.

      Really, what we're just saying here is rather than spending a whole big lump sum all at once to get what you need, you can just pay on a monthly basis for what you need when you need it.

      And finally, one of the other benefits that you'll also see is a reduction in the number of different tasks that we have to complete in terms of IT administration, setup of operating systems, management of physical infrastructure, what the procurement team has to manage, and so on.

      Again, right now we're just talking at a really high level about a fictitious scenario for Ozzymart to help you understand the types of benefits that we can get access to with cloud.

      So hopefully if you're embarking on a cloud journey, you're gonna have a happy finance manager, CEO, and other team members that you're working with as well.

      Okay, everybody, so that's a wrap to this lesson on why cloud matters.

      As I've said, we're really only just scratching the surface.

      This is just to introduce you to a scenario that can help you to understand the types of benefits we get access to with cloud.

      As we move throughout this course, we'll progressively dive deeper in terms of what cloud is, how you define it, the features you get access to, and other common concepts and terms.

      So thanks for joining me, I'll see you there.

    1. Welcome back and in this brief demo lesson I want to give you some experience of working with both EC2 instance connect as well as connecting with a local SSH client.

      Now these are both methods which are used for connecting to EC2 instances both with public IP version 4 addressing and IP version 6 addressing.

      Now to get started we're going to need some infrastructure so make sure that you're logged in as the IAM admin user into the general AWS account which is the management account of the organization and as always you'll need the northern Virginia region selected.

      Now in this demonstration you are going to be connecting to an EC2 instance using both instance connect and a local SSH client and to use a local SSH client you need a key pair.

      So to create that let's move across to the EC2 console, scroll down on the left and select key pairs.

      Now you might already have key pairs created from earlier in the course.

      If you have one created which is called A4L which stands for Animals for Life then that's fine.

      If you don't we're going to go ahead and create that one.

      So click on create key pair and then under name we're going to use A4L.

      Now if you're using Windows 10, macOS or Linux then you can select the PEM file format.

      If you're using Windows 8 or prior then you might need to use the PuTTY application and to do that you need to select PPK.

      But for this demonstration I'm going to assume that you're using the PEM format.

      So again this is valid on Linux, macOS or any recent versions of Microsoft Windows.

      So select PEM and then click on create key pair and when you do it's going to present you with a download.

      It's going to want you to save this key pair to your local machine so go ahead and do that.

      Once you've done that from the AWS console attached to this lesson is a one-click deployment link.

      So I want you to go ahead and click that link.

      That's going to move you to a quick create stack screen.

      Everything should be pre-populated.

      The stack name should be EC2 instance connect versus SSH.

      The key name box should already be pre-populated with A4L which is a key that you just created or one which you already had.

      Just move down to the very bottom, check the capabilities box and then click on create stack.

      Now you're going to need this to be in a create complete state before you continue with the demo lesson.

      So pause the video, wait for your stack to change to create complete and then you're good to continue.

      Okay so this stacks now in a create complete status and we're good to continue.

      Now if we click on the resources tab you'll see that this has created the standard animals for life VPC and then it's also created a public EC2 instance.

      So this is an EC2 instance with a public IP version 4 address that we can use to connect to.

      So that's what we're going to do.

      So click on services and then select EC2 to move to the EC2 console.

      Once you're there click on instances running and you should have a single EC2 instance A4L-publicEC2.

      Now the two different ways which I want to demonstrate connecting to this instance in this demo lesson are using a local SSH client and key based authentication and then using the EC2 instance connect method.

      And I want to show you how those differ and give you a few hints and tips which might come in useful for production usage and for the exams.

      So if we just go ahead and select this instance and then click on the security tab you'll see that we have this single security group which is associated to this instance.

      Now make sure the inbound rules is expanded and just have a look at what network traffic is allowed by this security group.

      So the first line allows port 80 TCP which is HTTP and it allows that to connect to the instance from any source IP address specifically IP version 4.

      We can tell it's IP version 4 because it's 0.0.0.0/0 which represents any IP version 4 address.

      Next we allow port 22 using TCP and again using the IP version 4 any IP match and this is the entry which allows SSH to connect into this instance using IP version 4.

      And then lastly we have a corresponding line which allows SSH using IP version 6.

      So we're allowing any IP address to connect using SSH to this EC2 instance.
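      To see why 0.0.0.0/0 matches any IPv4 source (and ::/0, the IPv6 equivalent, matches any IPv6 source), here's a small sketch using Python's standard ipaddress module. The specific addresses tested are just documentation-range examples, not part of the lesson:

      ```python
      import ipaddress

      # 0.0.0.0/0 is the "any IPv4 address" block; ::/0 is the IPv6 equivalent.
      any_v4 = ipaddress.ip_network("0.0.0.0/0")
      any_v6 = ipaddress.ip_network("::/0")

      # Documentation-range addresses, used here purely as examples.
      print(ipaddress.ip_address("203.0.113.7") in any_v4)   # True
      print(ipaddress.ip_address("2001:db8::1") in any_v6)   # True
      ```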

      And so connecting to it using SSH is relatively simple.

      We can right click on this instance and select connect and then choose SSH client and AWS provides us with all of the relevant information.

      Now note how under step number three we have this line: chmod 400 a4l.pem.

      I want to demonstrate what happens if we attempt to connect without changing the permissions on this key file.

      So to do that right at the bottom is an example command to connect to this instance.

      So just copy that into your clipboard.

      Then I want you to move to your command prompt or terminal.

      In my case I'm running macOS so I'm using a terminal application.

      Then you'll need to move to the folder where you have the PEM file stored or where you just downloaded it in one of the previous steps.

      I'm going to paste in that command which I just copied onto my clipboard.

      This is going to use the a4l.pem file as the identity information and then it's going to connect to the instance using the ec2-user local Linux user.

      And this is the host name that it's going to connect to.

      So this is my EC2 instance.

      Now I'm going to press enter and attempt that connection.

      First it will ask me to verify the authenticity of this server.

      So this is an added security method.

      This is getting the fingerprint of this EC2 instance.

      And it means that if we independently have a copy of this fingerprint, say from the administrator of the server that we're connecting to, then we can verify that we're connecting to that same server.

      Because it's possible that somebody could exploit DNS and replace a legitimate DNS name with one which points at a non-legitimate server.

      So that's important.

      You can't always rely on a DNS name.

      DNS names can be adjusted to point at different IP addresses.

      So this fingerprint is a method that you can use to verify that you're actually connecting to the machine or the instance which you think you are.
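As a rough illustration of what that fingerprint actually is: OpenSSH derives it as the base64-encoded SHA-256 digest of the server's public host key, with the trailing padding stripped. This sketch computes one from dummy bytes; a real check would hash the actual host key blob, for example as retrieved by ssh-keyscan.

```python
import base64
import hashlib

def ssh_fingerprint(host_key_bytes: bytes) -> str:
    """OpenSSH-style SHA256 fingerprint: base64(SHA-256(raw key)), '=' padding stripped."""
    digest = hashlib.sha256(host_key_bytes).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")

# Dummy key bytes purely for illustration -- not a real host key.
fp = ssh_fingerprint(b"not-a-real-host-key")
print(fp)  # e.g. SHA256:<43 base64 characters>
```

Because the digest is deterministic, comparing the fingerprint you see at the prompt with one obtained independently confirms you're talking to the same host key.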

      Now in this case, because we've just created this EC2 instance, we can be relatively certain that it is valid.

      So we're just going to go ahead and type yes and press enter.

      And then it will try to connect to this instance.

      Now immediately in my case, I got an error.

      And this error is going to be similar if you're using macOS or Linux.

If you're using Windows, then you may or may not get this error.

      And if you do get it, it might look slightly different.

      But look for the keyword of permissions.

      If you see that you have a permissions problem with your key, then that's the same error as I'm showing on my screen now.

      Basically what this means is that the SSH client likes it when the permissions on these keys are restricted, restricted to only the user that they belong to.

      Now in my case, the permissions on this file are 644.

      And this represents my user, my group, and then everybody.

      So this means this key is accessible to other users on my local system.

      And that's far too open to be safe when using local SSH.

      Now in Windows, you might have a similar situation where other users of your local machine have read permissions on this file.

      What this error is telling us to do is to correct those permissions.

      So if we go back to the AWS console, this is the command that we need to run to correct those permissions.

      So copy that into your clipboard, move back to your terminal, paste that in, and press enter.

      And that will correct those permissions.

      Now under Windows, the process is that you need to edit the permissions of that file.

      So right click properties and then edit the security.

      And you need to remove any user access to that file other than your local user.

      And that's the same process that we've just done here, only in Windows it's GUI based.

And under macOS or Linux, you use chmod.
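As a sketch of what those permission bits mean, here is the same fix expressed with Python's stat module rather than the shell; the file below is a throwaway stand-in for the real .pem key.

```python
import os
import stat
import tempfile

# Create a stand-in for the downloaded .pem key file.
fd, key_path = tempfile.mkstemp(suffix=".pem")
os.close(fd)

os.chmod(key_path, 0o644)          # the "too open" state: owner rw-, group r--, other r--
open_mode = stat.S_IMODE(os.stat(key_path).st_mode)

os.chmod(key_path, 0o400)          # what `chmod 400 key.pem` does: owner r--, nothing else
final_mode = stat.S_IMODE(os.stat(key_path).st_mode)

print(oct(open_mode), "->", oct(final_mode))   # 0o644 -> 0o400 on POSIX systems
os.remove(key_path)
```

Once group and other have no bits set, the SSH client stops complaining, because only the owning user can read the private key.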

So now that we've adjusted those permissions, if I use the up arrow to go back to the previous command and press enter, I'm able to connect to the EC2 instance.

      And that's using the SSH client.

To use the SSH client, you need to have network connectivity to the EC2 instance.

      And you need to have a valid SSH key pair.

      So you need the key stored on your local machine.

Now this can present scalability issues because if a large team needs access to this instance, then everybody in that team needs a copy of this key.

      And so that does present admin problems if you're doing it at scale.

      Now in addition to this, because you're connecting using an SSH client from your local machine, you need to make sure that the security group of this instance allows connections from your local machines.

      So in this case, it allows connections from any source IP address into this instance.

      And so that's valid for my IP address.

      You need to make sure that the security group on whichever instance you're attempting to connect to allows your IP address as a minimum.
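Conceptually, a security group source rule is just a CIDR match against the connecting address. A minimal Python sketch of that check, using documentation-range IPs as examples:

```python
import ipaddress

def rule_allows(source_cidr: str, client_ip: str) -> bool:
    """Conceptual model of a security group source match (illustrative only)."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(source_cidr)

# The catch-all rule matches any IPv4 address...
print(rule_allows("0.0.0.0/0", "203.0.113.10"))        # True

# ...while a "My IP" style /32 rule matches exactly one address.
print(rule_allows("198.51.100.7/32", "198.51.100.7"))  # True
print(rule_allows("198.51.100.7/32", "203.0.113.10"))  # False
```

This is why the catch-all rule works from anywhere, while a "My IP" rule breaks the moment the connection originates from any other address.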

      Now another method that you can use to connect to EC2 is EC2 instance connect.

      Now to use that, we right click, we select connect, and we have a number of options at the top.

      One of these is the SSH client that we've just used.

      Another one is EC2 instance connect.

      So if we select this option, we're able to connect to this instance.

      It shows us the instance ID, it shows us the public IP address, and it shows us the user to connect into this instance with.

Now AWS attempts to automatically determine the correct user to use.

      So when you launch an instance using one of the default AMIs, then it tends to pick correctly.

      However, if you generate your own custom AMI, it often doesn't guess correctly.

      And so you need to make sure that you're using the correct username when connecting using this method.

      But once you've got the correct username, you can just go ahead and click on connect, and then it will open a connection to that instance using your web browser.

      It'll take a few moments to connect, but once it has connected, you'll be placed at the terminal of this EC2 instance in exactly the same way as you were when using your local SSH.

Now one difference you might have noticed is that at no point were you prompted to provide a key.

      When you're using EC2 instance connect, you're using AWS permissions to connect into this instance.

      So because we're logged in using an admin user, we have those permissions, but you do need relevant permissions added to the identity of whoever is using instance connect to be able to connect into the instance.

      So this is managed using identity policies on the user, the group or the role, which is attempting to access this instance.
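As an illustration, an identity policy granting Instance Connect access typically allows the SendSSHPublicKey action against the instance; the account ID, instance ID and OS user below are placeholders, not values from this demo.

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "ec2-instance-connect:SendSSHPublicKey",
    "Resource": "arn:aws:ec2:us-east-1:111111111111:instance/i-0example",
    "Condition": { "StringEquals": { "ec2:osuser": "ec2-user" } }
  }]
}
```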

      Now one important element of this, which I want to demonstrate, if we go back to instances and we select the instance, click on security, and then click on the security group, which is associated with this instance.

      Scroll down, click on edit inbound rules, and then I want you to locate the inbound rule for IP version 4 SSH, SSH TCP 22, and then it's using this catchall, so 0.0.0.0/0, which represents any IP version 4 address.

      So go ahead and click on the cross to remove that, and then on that same line in the source area, click on this drop down and change it to my IP.

      So this is my IP address, yours will be different, but then we're going to go ahead and save that rule.

      Now just close down the tab that you've got connected to instance connect, move back to the terminal, and type exit to disconnect from that instance, and then just rerun the previous command.

      So connect back to that instance using your local SSH client.

      You'll find that it does reconnect because logically enough, this connection is coming from your local IP address, and you've changed the security group to allow connections from that address, so it makes sense that this connection still works.

      Moving back to the console though, let's go to the EC2 dashboard, go to running instances, right click on this instance, go to connect, select EC2 instance connect, and then click on connect and just observe what happens.

      Now you might have spent a few minutes waiting for this to connect, and you'll note that it doesn't connect.

      Now this might seem strange at this point because you're connecting from a web browser, which is running on your local machine.

      So it makes sense that if you can connect from your local SSH client, which is also running on your local machine, you should be able to connect using EC2 instance connect.

      Now this might seem logical, but the crucial thing about EC2 instance connect is that it's not actually originating connections from your local machine.

      What's happening is that you're making a connection through to AWS, and then once your connection arrives at AWS, the EC2 instance connect service is then connecting to the EC2 instance.

      Now what you've just done is you've edited the security group of this instance to only allow your local IP address to connect, and this means that the EC2 instance connect service can no longer connect to this instance.

So what you need in order to allow the EC2 instance connect service to work is you either need to allow every source IP address, so 0.0.0.0/0, but of course that's bad practice for production usage.

      It's much more secure if you go to this URL, and I'll make sure that I include this attached to this lesson.

      This is a list of all of the different IP ranges which AWS use for their services.

      Now because I have this open in Firefox, it might look a little bit different.

      If I just go to raw data, that might look the same as your browser.

      If you're using Firefox, you have the ability to open this as a JSON document.

      Both of them show the same data, but when it's JSON, you have the ability to collapse these individual components.

      But the main point about this document is that this contains a list of all of the different IP addresses which are used in each different region for each different service.

      So if we wanted to allow EC2 instance connect for a particular region, then we might search for instance, locate any of these items which have EC2 instance connect as the service, and then just move through them looking for the one which matches the region that we're using.

      Now in my case, I'm using US East One, so I'd scroll through all of these IP address ranges looking for US East One.

      There we go, I've located it.

      It's using this IP address range.
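Rather than scrolling through the document by hand, the same lookup can be scripted. This sketch filters a trimmed, illustrative sample with the same schema as the published ip-ranges.json; the prefix values here are examples only.

```python
import json

# A trimmed, illustrative sample of the schema used by
# https://ip-ranges.amazonaws.com/ip-ranges.json (values are examples only).
sample = json.loads("""
{
  "prefixes": [
    {"ip_prefix": "3.5.140.0/22",     "region": "ap-northeast-2", "service": "AMAZON"},
    {"ip_prefix": "18.206.107.24/29", "region": "us-east-1",      "service": "EC2_INSTANCE_CONNECT"},
    {"ip_prefix": "13.48.4.200/30",   "region": "eu-north-1",     "service": "EC2_INSTANCE_CONNECT"}
  ]
}
""")

# Filter for the service and region we care about instead of scanning visually.
matches = [p["ip_prefix"] for p in sample["prefixes"]
           if p["service"] == "EC2_INSTANCE_CONNECT" and p["region"] == "us-east-1"]
print(matches)   # ['18.206.107.24/29']
```

The same filter run against the full downloaded document would yield the range to paste into the security group rule.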

      So I might copy this into my clipboard, move back to the EC2 console, select the instance, click on security, select the security group of this instance, scroll down, edit the inbound rules, remove the entry for my IP address, paste in the entry for the EC2 instance connect service, and then save that rule.

      And now what you'll find if you move back to your terminal and try to interact with this instance, you might be able to initially because the connection is still established, but if you exit and then attempt to reconnect, this time you'll see that you won't be able to connect because now your local IP address is no longer allowed to connect to this instance.

However, if you move back to the AWS console, go to the dashboard and then instances running, right click on the instance, select connect, select instance connect and then click on connect.

      Now you'll be allowed to connect using EC2 instance connect.

      And the reason for that just to reiterate is that you've just edited the security group of this EC2 instance and you've allowed the IP address range of the EC2 instance connect service.

      So now you can connect to this instance and you could do so at scale using AWS permissions.

      So I just wanted to demonstrate how both of those connection methods work, both instance connect and using a local SSH client.

      That's everything I wanted to cover.

      So just go ahead and move back to the CloudFormation console, select this stack that you created using the one click deployment, click on delete and then confirm that process.

      And that will clear up all of the infrastructure that you've used in this demo lesson.

      At this point though, that's everything I wanted to cover.

      So go ahead, complete this video and when you're ready, I'll look forward to you joining me in the next.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

      Thank you for these insightful suggestions. In the revised Manuscript, we have added a number of additional analyses that provide a quantitative comparison between the classical and fractal cycle approaches aiming to identify the source of the discrepancies between classical and fractal cycle durations. Likewise, we assessed the intra-fractal and intra-classical method reliability as outlined below.

      Reviewer #1 (Recommendations For The Authors):

      One of the challenges in interpreting the results of the manuscript is understanding whether the differences between the two methods are due to a genuine difference in what these two methods are quantifying or simply noise/variability in each method. If the authors could provide some more insight into this, it would be a great help in assessing their findings and I think bolster the applicability of their method.

      (1) Method reliability: The manuscript clearly shows that cycle length is robustly correlated between fractal and classical in multiple datasets, however, it is hard to assign a meaningful interpretation to the correlation value (ie R = 0.5) without some reference point. This could be provided by looking at the intra-method correlation of cycle lengths. In the case of classical scoring, inter-scorer results could be compared, if the R-value here is significantly higher than 0.5 it would suggest genuine differences between the methods. In the case of fractal scoring, inter-electrode results could be compared / results with slight changes to the peak prominence threshold or smoothing window.

      In the revised Manuscript, we performed the following analyses to show the intra-method reliability:

a) Classical cycle reliability: For the revised Manuscript, an additional scorer has independently defined classical sleep cycles for all datasets and marked sleep cycles with skipped REM sleep. Likewise, we have performed automatic sleep cycle detection using the R “SleepCycles” package by Blume & Cajochen (2021). We have added a new Table S8 to Supplementary Material 2 that shows the averaged cycle durations and cycle numbers obtained by the two human scorers and automatic algorithm as well as the inter-scorer rate agreement. We have also added to the Supplementary Excel file a new sheet named “Classical method reliability” that reports classical cycle durations for each participant and each dataset as defined by two human scorers and the algorithm.

      We found that the correlation coefficients between two human scorers ranged from 0.69 to 0.91 (in literature, r’s > 0.7 are defined as strong scores) in different datasets, thus being higher than correlation coefficients between fractal and classical cycle durations, which in turn ranged from 0.41 to 0.55 (r’s in the range of 0.3 – 0.7 are considered moderate scores). The correlation coefficients between human raters and the automatic algorithm showed remarkably lower coefficients ranging from 0.30 to 0.69 (moderate scores) in different datasets, thus lying within the range of the correlation coefficients between fractal and classical cycle durations. This analysis is reported in Supplementary Material 2, section ”Intra-classical method reliability” and Table S8.

      b) Fractal cycle reliability: In the revised Supplementary Material 2 of our Manuscript, we assessed the intra-fractal method reliability, we correlated between the durations of fractal cycles calculated as defined in the main text, i.e., using a minimum peak prominence of 0.94 z and smoothing window of 101 thirty-second epochs, with those calculated using a minimum peak prominence ranging from 0.86 to 1.20 z with a step size of 0.04 z and smoothing windows ranging from 81 to 121 thirty-second epochs with a step size of 10 epochs (Table S7). We found that fractal cycle durations calculated using adjacent minimum peak prominence (i.e., those that differed by 0.04 z) showed r’s > 0.92, while those calculated using adjacent smoothing windows (i.e., those that differed by 10 epochs) showed r’s > 0.84. In addition, we correlated fractal cycle durations defined using different channels and found that the correlation coefficients ranged between 0.66 – 0.67 (Table S1). Thus, most of the correlations performed to assess intra-fractal method reliability showed correlation coefficients (r > 0.6) higher than those obtained to assess inter-method reliability (r = 0.41 – 0.55), i.e., correlations between fractal and classical cycle. This analysis is reported in Supplementary Material 2, section ”Intra-fractal method reliability” and Table S7. Likewise, we have added a new sheet named “Fractal method reliability” that reports the actual values for the abovementioned parameters to the Supplementary Excel file. For a discussion on potential sources of differences, see below.

      (2) Origin of method differences: The authors outline a few possible sources of discrepancies between the two methods (peak vs REM end, skipped REM cycle detection...) but do not quantify these contributions. It would be interesting to identify some factors that could predict for either a given night of sleep or dataset whether it is likely to show a strong or weak agreement between methods. This could be achieved by correlating measures of the proposed differences ("peak flatness", fractal cycle depth, or proportion of skipped REM cycles) with the mismatch between the two methods.

      In the revised Manuscript, we have quantified a few possible sources of discrepancies between the durations of fractal vs classical cycles and added a new section named “Sources of fractal and classical cycle mismatches” to the Results as well as new Tables 5 and S10 (Supplementary Material 2). Namely, we correlated the difference in classical vs fractal sleep cycle durations on the one side, and either the amplitude of fractal descent/ascent (to reflect fractal cycle depth), duration of cycles with skipped REM sleep/TST, duration of wake after sleep onset/TST or the REM episode length of a given cycle (to reflect peak flatness) on the other side. We found that a higher difference in classical vs fractal cycle duration was associated with a higher proportion of wake after sleep onset (r = 0.226, p = 0.001), shallower fractal descents (r = 0.15, p = 0.002) and longer REM episodes (r = 0.358, p < 0.001, n = 417 cycles, Table S10 in Supplementary Material 2). The rest of the assessed parameters showed no significant correlations (Table S10). We have added a new sheet named “Fractal-classical mismatch” that reports the actual values for the abovementioned parameters to the Supplementary Excel file.  

      (3) Skipped REM cycles: the authors underline that the fractal method identified skipped REM cycles. It seems likely that manual identification of skipped REM cycles is particularly challenging (ie we would expect this to be a particular source of error between two human scorers). If this is indeed the case, it would be interesting to discuss, since it would highlight an advantage of their methodology that they already point out (l644).

      In the revised Manuscript, we have added the inter-scorer rate agreement regarding cycles with skipped REM sleep, which was equal to 61%, which is 32% lower than the performance of our fractal cycle algorithm (93%). These findings are now reported in the “Skipped cycles” section of the Results and in Table S9 of Supplementary Material 2. We also discuss them in Discussion:

“Our algorithm detected skipped cycles in 93% of cases while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was 61% only; thus, 32% lower. We deduce that the fractal cycle algorithm detected skipped cycles since a lightening of sleep that replaces a REM episode in skipped cycles is often expressed as a local peak in fractal time series.” Discussion, section “Fractal and classical cycles comparison”, paragraph 5.

      Minor comments:

      - In the subjects where the number of fractal and classical cycles did not match, how large was the difference (ie just one extra cycle or more)? Correlating cycle numbers could be one way to quantify this.

      In the revised Manuscript, we have reported the required information for the participants with no one-to-one match (46% of all participants) as follows: 

      “In the remaining 46% of the participants, the difference between the fractal and classical cycle numbers ranged from -2 to 2 with the average of -0.23 ± 1.23 cycle. This subgroup had 4.6 ± 1.2 fractal cycles per participant, while the number of classical cycles was 4.9 ± 0.7 cycles per participant. The correlation coefficient between the fractal and classical cycle numbers was 0.280 (p = 0.006) and between the cycle durations – 0.278 (p=0.006).” Results, section “Correspondence between fractal and classical cycles”, last paragraph.

      - When discussing the skipped REM cycles (l467), the authors explain: "For simplicity and between-subject consistency, we included in the analysis only the first cycles". I'm not sure I understood this, could they clarify to which analysis they are referring to?

      In the revised Manuscript, we performed this analysis twice: using first cycles and using all cycles and therefore have rephrased this as follows:

“We tested whether the fractal cycle algorithm can detect skipped cycles, i.e., the cycles where an anticipated REM episode is skipped (possibly due to too high homeostatic pressure). We performed this analysis twice. First, we counted all skipped cycles (except the last cycles of a night, which might lack REM episode for other reasons, e.g., a participant had/was woken up). Second, we counted only the first classical cycles (i.e., the first cycle out of the 4 – 6 cycles that each participant had per night, Fig. 3 A – B) as these cycles coincide with the highest NREM pressure. An additional reason to disregard skipped cycles observed later during the night was our aim to achieve higher between-subject consistency as later skipped cycles were observed in only a small number of participants.” Results, section “Skipped cycles”, first paragraph.

      - The inclusion of all the hypnograms as a supplementary is a great idea to give the reader concrete intuition of the data. If the limits of the sleep cycles for both methods could be added it would be very useful.

      Supplementary Material 1 has been updated such that each graph has a mark showing the onsets of fractal and classical sleep cycles, including classical cycles with skipped REM sleep.

      - The difference in cycle duration between adults and children seems stronger / more reliable for the fractal cycle method, particularly in the histogram (Figure 3C). Is this difference statistically significant?

      In the revised Manuscript, we have added the Multivariate Analysis of Variance to compare F-values, partial R-squared and eta squared. The findings are as follows:

“To compare the fractal approach with the classical one, we performed a Multivariate Analysis of Variance with fractal and classical cycle durations as dependent variables, the group as an independent variable and the age as a covariate. We found that fractal cycle durations showed higher F-values (F(1, 43) = 4.5 vs F(1, 43) = 3.1), adjusted R squared (0.138 vs 0.089) and effect sizes (partial eta squared 0.18 vs 0.13) than classical cycle durations.” Results, Fractal cycles in children and adolescents, paragraph 3.

      There have been some recent efforts to define sleep cycles in an automatic way using machine learning approaches. It could be interesting to mention these in the discussion and highlight their relevance to the general endeavour of automatizing the sleep cycle identification process.

      In the Discussion of the revised Manuscript, we have added the section on the existing automatic sleep cycle definition algorithms:

      “Even though recently, there has been a significant surge in sleep analysis incorporating various machine learning techniques and deep neural network architectures, we should stress that this research line mainly focused on the automatic classification of sleep stages and disorders almost ignoring the area of sleep cycles. Here, as a reference method, we used one of the very few available algorithms for sleep cycle detection (Blume & Cajochen, 2021). We found that automatically identified classical sleep cycles only moderately correlated with those detected by human raters (r’s = 0.3 – 0.7 in different datasets). These coefficients lay within the range of the coefficients between fractal and classical cycle durations (r = 0.41 – 0.55, moderate) and outside the range of the coefficients between classical cycle durations detected by two human scorers (r’s = 0.7 – 0.9, strong, Supplementary Material 2, Table S8).” Discussion, section “Fractal and classical cycles comparison”, paragraph 4.

      Reviewer #2 (Public Review):

      One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

      Thank you for this suggestion. In the revised Manuscript, we have added a new figure (Fig.S1 E, Supplementary Material 2), illustrating the goodness of fit of the data as assessed by the IRASA method.

      The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

We agree that these cycles are not fractal per se. In the Introduction, when we mention them for the first time, we name them “fractal activity-based cycles of sleep” and immediately after that add “or fractal cycles for short”. In the revised version, we reintroduced this abbreviation at each new major section and in the Abstract. Nevertheless, given that the term “fractal cycles” is used 88 times, after those “reminders”, we used the short name again to facilitate readability. We hope that this will highlight that the cycles are not fractal per se and thus reduce the possible confusion while keeping the manuscript short.

      The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.

      In cases where the number of fractal cycles differed from the number of classical cycles (from 34 to 55% in different datasets as in the case of Fig.3B), we did not perform one-to-one matching of cycles. Instead, we averaged the duration of the fractal and classical cycles over each participant and only then correlated between them (Fig.2C). For a subset of the participants (45 – 66% of the participants in different datasets) with a one-to-one match between the fractal and classical cycles, we performed an additional correlation without averaging, i.e., we correlated the durations of individual fractal and classical cycles (Fig.4S of Supplementary Material 2). This is stated in the Methods, section Statistical analysis, paragraph 2.

There are a few statements in the discussion that I felt were not well-supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. Also about "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

      In the revised Manuscript, we have removed these statements from both Introduction and Discussion.

      The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686. I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, then the argument still holds except that its alterations between slow-wave sleep and N1 that could be what drives the detection).

      In the revised Manuscript, we have removed the “NaN analysis” from both Results and Discussion. We have replaced it with the correlation between the difference between the durations of the classical and fractal cycles and proportion of wake after sleep onset. The finding is as follows:

      “A larger difference between the durations of the classical and fractal cycles was associated with a higher proportion of wake after sleep onset in 3/5 datasets as well as in the merged dataset (Supplementary Material 2, Table S10).” Results, section “Fractal cycles and wake after sleep onset”, last two sentences. This is also discussed in Discussion, section “Fractal cycles and age”, paragraph 1, last sentence. 

      To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

      Thank you for this important comment. Overall, our paper suggests that the fractal approach might reflect the cycling nature of sleep in a more precise and sensitive way than classical hypnograms. Importantly, neither fractal nor classical methods can shed light on the mechanism underlying sleep cycle generation due to their correlational approach. Despite this, the advantages of fractal over classical methods mentioned in our Manuscript are as follows:

      (1) Fractal cycles are based on a real-valued metric with known neurophysiological functional significance, which introduces a biological foundation and a more gradual impression of nocturnal changes compared to the abrupt changes that are inherent to hypnograms that use a rather arbitrary assigned categorical value (e.g., wake=0, REM=-1, N1=-2, N2=-3 and SWS=-4, Fig.2 A).

      (2) Fractal cycle computation is automatic and thus objective, whereas classical sleep cycle detection is usually based on the visual inspection of hypnograms, which is time-consuming, subjective and error-prone. Few automatic algorithms are available for sleep cycle detection, which only moderately correlated with classical cycles detected by human raters (r’s = 0.3 – 0.7 in different datasets here).

(3) Defining the precise end of a classical sleep cycle with skipped REM sleep, which is common in children, adolescents and young adults, using a hypnogram is often difficult and arbitrary. The fractal cycle algorithm could detect such cycles in 93% of cases, while the hypnogram-based agreement on the presence/absence of skipped cycles between two independent human raters was only 61%, i.e., 32% lower.

      (4) The fractal analysis showed a stronger effect size, higher F-value and R-squared than the classical analysis for the cycle duration comparison in children and adolescents vs young adults. The first and second fractal cycles were significantly shorter in the pediatric compared to the adult group, whereas the classical approach could not detect this difference.

      (5) Fractal – but not classical – cycle durations correlated with the age of adult participants.

      These bullets are now summarized in Table 5 that has been added to the Discussion of the revised manuscript.

    1. Extract the contents of the ZIP file into the folder on your local system that is linked to your GitHub repository.

      For Windows: copy the folders into the designated remote repository, replacing the README at the destination.

    1. Welcome back.

      In this lesson, I'll be talking about Network Address Translation, or NAT, a process of giving a private resource outgoing only access to the internet.

      And a NAT gateway is the AWS implementation that's available within a VPC.

      There's quite a bit of theory to cover, so let's get started.

      So what is NAT?

      Well, it stands for Network Address Translation.

      This is one of those terms which means more than people think that it does.

      In a strict sense, it's a set of different processes which can adjust IP packets by changing their source or destination IP addresses.

      Now, you've seen a form of this already.

      The internet gateway actually performs a type of NAT known as static NAT.

      It's how a resource can be allocated with a public IP version 4 address, and then when the packets of data leave those resources and pass through the internet gateway, it adjusts the source IP address on the packet from the private address to the public, and then sends the packet on, and then when the packet returns, it adjusts the destination address from the public IP address to the original private address.

      That's called static NAT, and that's how the internet gateway implements public IP version for addressing.
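As a rough sketch of that static NAT behaviour, here's a minimal Python illustration; the addresses and function names are hypothetical examples, not anything AWS exposes:

```python
# Static NAT: a fixed 1:1 private <-> public mapping, as the internet
# gateway performs. The address pair here is purely illustrative.
static_nat = {"10.16.16.20": "52.95.36.10"}
reverse_nat = {pub: priv for priv, pub in static_nat.items()}

def outbound_rewrite(src_ip):
    """On egress, swap the private source for its allocated public address."""
    return static_nat[src_ip]

def inbound_rewrite(dst_ip):
    """On return, swap the public destination back to the private address."""
    return reverse_nat[dst_ip]

print(outbound_rewrite("10.16.16.20"))  # 52.95.36.10
print(inbound_rewrite("52.95.36.10"))   # 10.16.16.20
```

Because the mapping is one-to-one and fixed, traffic can be initiated in either direction, which is exactly what changes once many private IPs share a single public IP.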

      Now, what most people think of when they think of NAT is a subset of NAT called IP Masquerading.

      And IP Masquerading hides a whole private side IP block behind a single public IP.

      So rather than the one private IP to one public IP process that the internet gateway does, NAT is many private IPs to one single IP.

      And this technique is popular because IP version 4 addresses are running out.

      The public address space is rapidly becoming exhausted.

      IP Masquerading, or what we'll refer to for the rest of this lesson as NAT, gives a whole private range of IP addresses outgoing only access to the public internet and the AWS public zone.

      I've highlighted outgoing because that's the most important part, because many private IPs use a single public IP.

      Incoming access doesn't work.

      Private devices that use NAT can initiate outgoing connections to internet or AWS public space services, and those connections can receive response data, but you cannot initiate connections from the public internet to these private IP addresses when NAT is used.

      It doesn't work that way.

      Now, AWS has two ways that it can provide NAT services.

      Historically, you could use an EC2 instance configured to provide NAT, but it's also a managed service, the NAT gateway, which you can provision in the VPC to provide the same functionality.

      So let's look at how this works architecturally.

      This is a simplified version of the Animals for Life architecture that we've been using so far.

      On the left is an application tier subnet in blue, and it's using the IP range 10.16.32.0/20.

      So this is a private only subnet.

      Inside it are three instances, I01, which is using the IP 10.16.32.10, I02, which is using 32.20, and I03, which is using 32.30.

      These IP addresses are private, so they're not publicly routable.

      They cannot communicate with the public internet or the AWS public zone services.

      These addresses cannot be routed across a public style network.

      Now, if we wanted this to be allowed, if we wanted these instances to perform certain activities using public networking, for example, software updates, how would we do it?

      Well, we could make the subnets public in the same way that we've done with the public subnets or the web subnets, but we might not want to do that architecturally.

      With this multi-tier architecture that we're implementing together, part of the design logic is to have tiers which aren't public and aren't accessible from the public internet.

      Now, we could also host some kind of software update server inside the VPC, and some businesses choose to do that.

      Some businesses run Windows update services or Linux update services inside their private network, but that comes with an admin overhead.

      NAT offers us a third option, and it works really well in this style of situation.

      We provision a NAT gateway into a public subnet, and remember, the public subnet allows us to use public IP addresses.

      The public subnet has a route table attached to it, which provides default IP version 4 routes pointing at the internet gateway.

      So, because the NAT gateway is located in this public web subnet, it has a public IP which is routable across the public internet, so it's now able to send data out and get data back in return.

      Now, the private subnet where the instances are located can also have its own route table, and this route table can be different than the public subnet route table.

      So, we could configure it so that the route table that's on the application subnet has a default IP version 4 route, but this time, instead of pointing at the internet gateway, like the web subnet users, we configure this private route table so that it points at the NAT gateway.

      This means when those instances are sending any data to any IP addresses that do not belong inside the VPC, by default, this default route will be used, and that traffic will get sent to the NAT gateway.
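The route selection just described, where the more specific VPC local route wins over the 0.0.0.0/0 default route, is a longest-prefix match. Here's a minimal sketch using the Animals for Life CIDR; the table contents and target names are illustrative only:

```python
import ipaddress

# Hypothetical private-subnet route table: the VPC local route plus a
# default route pointing at the NAT gateway.
route_table = {
    "10.16.0.0/16": "local",       # traffic staying inside the VPC
    "0.0.0.0/0": "nat-gateway",    # everything else: the default route
}

def route_lookup(dest_ip: str) -> str:
    """Return the target of the most specific (longest-prefix) matching route."""
    dest = ipaddress.ip_address(dest_ip)
    best = None
    for cidr, target in route_table.items():
        net = ipaddress.ip_network(cidr)
        if dest in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, target)
    return best[1]

print(route_lookup("10.16.32.20"))  # local: another instance in the VPC
print(route_lookup("1.3.3.7"))      # nat-gateway: internet-bound traffic
```

Traffic to another in-VPC address matches the /16 local route, so it never reaches the NAT gateway; only addresses outside the VPC fall through to the default route.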

      So, let's have a look at how this packet flow works.

      Let's simulate the flow packets from one of the private instances and see what the NAT gateway actually does.

      So, first, instance 1 generates some data.

      Let's assume that it's looking for software updates.

      So, this packet has a source IP address of instance 1's private IP and a destination of 1.3.3.7.

      For this example, let's assume that that's a software update server.

      Now, because we have this default route on the route table of the application subnet, that packet is routed through to the NAT gateway.

      The NAT gateway makes a record of the data packet.

      It stores the destination that the packet is for, the source address of the instance sending it, and other details which help it identify the specific communication in future.

      Remember, multiple instances can be communicating at once, and for each instance, it could be having multiple conversations with different public internet hosts.

      So, the NAT gateway needs to be able to uniquely identify those.

      So, it records the IP addresses involved, the source and destination, the port numbers, everything it needs, into a translation table.

      So, the NAT gateway maintains something called a translation table which records all of this information.

      And then, it adjusts the packet that's been sent by the instance, changing the source address of this IP packet to be its own source address.
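To make that record-and-rewrite step concrete, here's a minimal, hypothetical sketch of a NAT translation table. Real implementations, including the NAT gateway, track protocols, timeouts and more state; this only models the many-to-one source rewrite and the reverse lookup for returning traffic:

```python
# Minimal sketch of IP masquerading: many private sources hide behind one
# NAT address. All addresses and ports here are illustrative.
NAT_IP = "10.16.48.10"       # hypothetical NAT gateway address
translation_table = {}       # nat_port -> (private_ip, private_port)
next_port = 1024             # next free source port on the NAT side

def outbound(src_ip, src_port, dst_ip, dst_port):
    """Record the flow in the translation table, then rewrite the source."""
    global next_port
    nat_port = next_port
    next_port += 1
    translation_table[nat_port] = (src_ip, src_port)
    # The packet leaves with the NAT gateway's address as its source.
    return (NAT_IP, nat_port, dst_ip, dst_port)

def inbound(dst_port):
    """Returning traffic: look up the original private source, or None."""
    return translation_table.get(dst_port)

pkt = outbound("10.16.32.10", 50000, "1.3.3.7", 443)
print(pkt)            # ('10.16.48.10', 1024, '1.3.3.7', 443)
print(inbound(1024))  # ('10.16.32.10', 50000)
print(inbound(9999))  # None
```

Notice that `inbound(9999)` returns None: with no recorded outbound flow there's nothing to translate an unsolicited inbound packet to, which is why NAT provides outgoing-only access.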

      Now, if this NAT appliance were anywhere outside AWS, what it would do right now is adjust the packet to have a publicly routable address directly.

      But remember, nothing inside the VPC actually has a public IP version 4 address directly attached to it.

      That's what the internet gateway does.

      So, the NAT gateway, because it's in the web subnet, it has a default route, and this default route points at the internet gateway.

      And so, the packet is moved from the NAT gateway to the internet gateway by the VPC router.

      At this point, the internet gateway knows that this packet is from the NAT gateway.

      It knows that the NAT gateway has a public IP version 4 address associated with it, and so, it modifies the packet to have a source address of the NAT gateway's public address, and it sends it on its way.

      The NAT gateway's job is to allow multiple private IP addresses to masquerade behind the IP address that it has.

      That's where the term IP masquerading comes from.

      That's why it's the more accurate term.

      So, the NAT gateway takes all of the incoming packets from all of the instances that it's managing, and it records all the information about the communication.

      It takes those packets, it changes the source address from being those instances to its own IP address, its own external-facing IP address.

      If it was outside AWS, this would be a public address directly.

      That's how your internet router works for your home network.

      All of the devices internally on your network talk out using one external IP address, your home router uses NAT.

      But because it's in AWS, it doesn't have directly attached a real public IP.

      The internet gateway translates from its IP address to the associated public one.

      So, that's how the flow works.

      If you need to give an instance its own public IP version 4 address, then only the internet gateway is required.

      If you want to give private instances outgoing access to the internet and the AWS public services such as S3, then you need both the NAT gateway to do this many-to-one translation and the internet gateway to translate from the IP of the NAT gateway to a real public IP version 4 address.

      Now, let's quickly run through some of the key facts for the NAT gateway product that you'll be implementing in the next demo lesson.

      First, and I hope this is logical for you by now, it needs to run from a public subnet because it needs to be able to be assigned a public IP version 4 address for itself.

      So, to deploy a NAT gateway, you already need your VPC in a position where it has public subnets.

      And for that, you need an internet gateway, subnets configured to allocate public IP version 4 addresses and default routes for those subnets pointing at the internet gateway.

      Now, a NAT gateway actually uses a special type of public IP version 4 address that we haven't covered yet called an elastic IP.

      For now, just know that these are IP version 4 addresses which are static.

      They don't change.

      These IP addresses are allocated to your account in a region and they can be used for whatever you want until you reallocate them.

      And NAT gateways use these elastic IPs; they're one of the services which utilizes elastic IPs.

      Now, we'll be talking about elastic IPs later on in the course.

      Now, NAT gateways are an AZ resilient service.

      If you read the AWS documentation, you might get the impression that they're fully resilient in a region like an internet gateway.

      They're not, they're resilient in the AZ that they're in.

      So they can recover from hardware failure inside an AZ.

      But if an AZ entirely fails, then the NAT gateway will also fail.

      For a fully region resilient service, so to mirror the high availability provided by an internet gateway, then you need to deploy one NAT gateway in each AZ that you're using in the VPC and then have a route table for private subnets in that availability zone, pointing at the NAT gateway also in that availability zone.

      So for every availability zone that you use, you need one NAT gateway and one route table pointing at that NAT gateway.

      Now, they aren't super expensive, but it can get costly if you have lots of availability zones, which is why it's important to always think about your VPC design.

      Now, NAT gateways are a managed service.

      You deploy them and AWS handle everything else.

      They can scale to 45 gigabits per second in bandwidth and you can always deploy multiple NAT gateways and split your subnets across multiple provisioned NAT gateways.

      If you need more bandwidth, you can just deploy more NAT gateways.

      For example, you could split heavy consumers across two different subnets in the same AZ, have two NAT gateways in the same AZ and just route each of those subnets to a different NAT gateway and that would quickly allow you to double your available bandwidth.

      With NAT gateways, you're billed based on the number that you have.

      So there's a standard hourly charge for running a NAT gateway, and this is obviously subject to change and differs between regions, but it's currently about four cents per hour.

      And note, this is actually an hourly charge.

      So partial hours are billed as full hours.

      And there's also a data processing charge.

      So that's the same amount as the hourly charge around four cents currently per gigabyte of processed data.

      So you've got this base charge that a NAT gateway consumes while running plus a charge based on the amount of data that you process.

      So keep both of those things in mind for any NAT gateway related questions in the exam.

      Don't focus on the actual values, just focus on the fact they have two charging elements.
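As a quick back-of-the-envelope sketch of those two charging elements, assuming an illustrative rate of $0.04 for both (check current regional pricing; these numbers will drift):

```python
import math

def nat_gateway_cost(hours, gb_processed,
                     hourly_rate=0.04, per_gb_rate=0.04):
    """Estimate NAT gateway cost: hourly charge plus data processing charge.

    Rates are illustrative placeholders, not current AWS pricing.
    Partial hours are billed as full hours, so hours are rounded up.
    """
    billed_hours = math.ceil(hours)
    return billed_hours * hourly_rate + gb_processed * per_gb_rate

# One 30-day month (720 hours) with 100 GB of processed data:
print(round(nat_gateway_cost(720, 100), 2))  # 32.8
```

Note how the base hourly charge accrues even with zero data processed, and a half hour of running costs the same as a full hour.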

      Okay, so this is the end of part one of this lesson.

      It's getting a little bit on the long side, and so I wanted to add a break.

      It's an opportunity just to take a rest or grab a coffee.

      Part two will be continuing immediately from the end of part one.

      So go ahead, complete the video, and when you're ready, join me in part two.