  1. Jul 2024
    1. Reviewer #1 (Public Review):

      This study uses MEG to test for a neural signature of the trial history effect known as 'serial dependence.' This is a behavioral phenomenon whereby stimuli are judged to be more similar than they really are, in feature space, to stimuli that were relevant in the recent past (i.e., the preceding trials). This attractive bias is prevalent across stimulus classes and modalities, but a neural source has been elusive. This topic has generated great interest in recent years, and I believe this study makes a unique contribution to the field. The paper is overall clear and compelling, and makes effective use of data visualizations to illustrate the findings. Below, I list several points where I believe further detail would be important to interpreting the results. I also make suggestions for additional analyses that I believe would enrich understanding but are inessential to the main conclusions.

      (1) In the introduction, I think the study motivation could be strengthened, to clarify the importance of identifying a neural signature here. It is clear that previous studies have focused mainly on behavior, and that the handful of neuroscience investigations have found only indirect signatures. But what would the type of signature being sought here tell us? How would it advance understanding of the underlying processes, the function of serial dependence, or the theoretical debates around the phenomenon?

      (1a) As one specific point of clarification, on p. 5, lines 91-92, a previous study (St. John-Saaltink et al.) is described as part of the current study motivation, stating that "as the current and previous orientations were either identical or orthogonal to each other, it remained unclear whether this neural bias reflected an attraction or repulsion in relation to the past." I think this statement could be more explicit as to why/how these previous findings are ambiguous. The St. John-Saaltink study stands as one of very few that may be considered to show evidence of an early attractive effect in neural activity, so it would help to clarify what sort of advance the current study represents beyond that.

      (1b) The study motivation might also consider the findings of Ranieri et al. (2022, J. Neurosci), Fornaciai, Togoli, & Bueti (2023, J. Neurosci), and Luo & Collins (2023, J. Neurosci), who all test various neural signatures of serial dependence.

      (2) Regarding the methods and results, it would help if the initial description of the reconstruction approach, in the main text, gave more context about what data is going into reconstruction (e.g., which sensors), a more conceptual overview of what the 'reconstruction' entails, and what the fidelity metric indexes. To me, all of that is important to interpreting the figures and results. For instance, when I first read, it was unclear to me what it meant to "reconstruct the direction of S1 during the S2 epoch" (p. 10, line 199)? As in, I couldn't tell how the data/model knows which item it is reconstructing, as opposed to just reporting whatever directional information is present in the signal.

      (2a) Relatedly, what does "reconstruction strength" reflect in Figure 2a? Is this different than the fidelity metric? Does fidelity reflect the strength of the particular relevant direction, or does it just mean that there is a high level of any direction information in the signal?

      (3) Then in the Methods, it would help to provide further detail still about the IEM training/testing procedure. For instance, it's not entirely clear to me whether all the analyses use the same model (i.e., all trained on stimulus encoding) or whether each epoch and timepoint is trained on the corresponding epoch and timepoint from the other session. This speaks to whether the reconstructions reflect a shared stimulus code across different conditions vs. that stimulus information about various previous and current trial items can be extracted if the model is tailored accordingly. Specifically, when you say "aim of the reconstruction" (p. 31, line 699), does that simply mean the reconstruction was centered in that direction (that the same data would go into reconstructing S1 or S2 in a given epoch, and what would differentiate between them is whether the reconstruction was centered to the S1 or S2 direction value)? Or were S1 and S2 trained and tested separately for the same epoch? And was training and testing all within the same time point (i.e., train on delay, test on delay), or train on the encoding of a given item, then test the fidelity of that stimulus code under various conditions?

      (3a) I think training and testing were done separately for each epoch and timepoint, but this could have important implications for interpreting the results. Namely if the models are trained and tested on different time points, and reference directions, then some will be inherently noisier than others (e.g., delay period more so than encoding), and potentially more (or differently) susceptible to bias. For instance, the S1 and S2 epochs show no attractive bias, but they may also be based on more high-fidelity training sets (i.e., encoding), and therefore less susceptible to the bias that is evident in the retrocue epoch.

      (4) I believe the work would benefit from a further effort to reconcile these results with previous findings (i.e., those that showed repulsion, like Sheehan & Serences), potentially through additional analyses. The discussion attributes the difference in findings to the "combination of a retro-cue paradigm with the high temporal resolution of MEG," but it's unclear how that explains why various others observed repulsion (thought to happen quite early) that is not seen at any stage here. In my view, the temporal (as well as spatial) resolution of MEG could be further exploited here to better capture the early vs. late stages of processing. For instance, by separately examining earlier vs. later time points (instead of averaging across all of them), or by identifying and analyzing data in the sensors that might capture early vs. late stages of processing. Indeed, the S1 and S2 reconstructions show subtle repulsion, which might be magnified at earlier time points but then shift (toward attraction) at later time points, thereby counteracting any effect. Likewise, the S1 reconstruction becomes biased during the S2 epoch, consistent with previous observations that the SD effects grow across a WM delay. Maybe both S1 and S2 would show an attractive bias emerging during the later (delay) portion of their corresponding epoch? As is, the data nicely show that an attractive bias can be detected in the retrocue period activity, but they could still yield further specificity about when and where that bias emerges.

      (5) A few other potentially interesting (but inessential) considerations: A benchmark property of serial dependence is its feature-specificity, in that the attractive bias occurs only between current and previous stimuli that are within a certain range of similarity to each other in feature space. I would be very curious to see if the neural reconstructions manifest this principle - for instance, if one were to plot the trialwise reconstruction deviation from 0, across the full space of current-previous trial distances, as in the behavioral data. Likewise, something that is not captured by the DoG fitting approach, but which this dataset may be in a position to inform, is the commonly observed (but little understood) repulsive effect that appears when current and previous stimuli are quite distinct from each other. As in, Figure 1b shows an attractive bias for direction differences around 30 degrees, but a repulsive one for differences around 170 degrees - is there a corresponding neural signature for this component of the behavior?

    1. Auth needs to be pluggable. — Jacob Kaplan-Moss, "REST worst practices" Authentication is the mechanism of associating an incoming request with a set of identifying credentials, such as the user the request came from, or the token that it was signed with. The permission and throttling policies can then use those credentials to determine if the request should be permitted. REST framework provides several authentication schemes out of the box, and also allows you to implement custom schemes. Authentication always runs at the very start of the view, before the permission and throttling checks occur, and before any other code is allowed to proceed. The request.user property will typically be set to an instance of the contrib.auth package's User class. The request.auth property is used for any additional authentication information, for example, it may be used to represent an authentication token that the request was signed with.

      In simple terms, let's break down the concept of pluggable authentication and the key points from the text with examples:

      Pluggable Authentication

      Pluggable authentication means that the system should be flexible and allow different ways to verify who a user is. Think of it like having different keys for the same door, where each key represents a different method of proving your identity.

      Key Points and Examples

      1. What is Authentication?
         • Authentication is like checking an ID card at the entrance of a building to confirm that the person trying to enter is who they say they are.
         • Example: When you log in to a website with a username and password, the site verifies your identity.

      2. Why Should Auth be Pluggable?
         • Different applications, or different parts of one application, might need different methods to verify identity.
         • Example: One part of your app might use a username and password, while another might use a fingerprint or a token sent to your phone.

      3. REST Framework's Role:
         • REST framework provides several built-in authentication schemes and lets developers add custom ones.
         • Example: Basic authentication (username and password), session authentication, and token authentication (a secret key sent with each request) are available out of the box; OAuth-style logins (such as logging in with Google) are typically added through third-party packages.

      4. When Does Authentication Happen?
         • Authentication runs at the very start of the view, before the permission and throttling checks and before any other code. It only establishes who is making the request; the permission and throttling policies then decide whether the request is allowed.
         • Example: Before checking whether a user may view a page, or how many times they have accessed it, the system first works out who the user is.

      5. request.user and request.auth Properties:
         • request.user: This property holds the authenticated user, typically an instance of the User class from Django's contrib.auth package.
         • Example: After logging in, request.user gives access to details such as the user's username and email.
         • request.auth: This property holds any additional authentication information, such as a token.
         • Example: With token authentication, the token that the request was signed with is available as request.auth.

      Simplified Summary

      Authentication needs to be adaptable, allowing different methods to verify user identity. REST framework ships with several built-in schemes and supports custom ones, and it always runs authentication before the permission and throttling checks. Once a request is authenticated, the user is available as request.user, and any extra authentication data (like a token) is available as request.auth.
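
      The idea of "pluggable" schemes tried in order can be sketched without any framework. The snippet below is only an illustration, not REST framework's implementation: `Request`, `TokenAuth`, `HeaderUserAuth`, and `perform_authentication` are hypothetical names, and unlike a real scheme the toy token class silently passes on an unknown key instead of rejecting the request.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Request:
    """Hypothetical stand-in for an incoming HTTP request."""
    headers: dict = field(default_factory=dict)
    user: Optional[str] = None   # filled in by authentication
    auth: Optional[str] = None   # e.g. the token the request was signed with


class TokenAuth:
    """Accepts 'Authorization: Token <key>' headers (toy in-memory key store)."""
    keys = {"abc123": "alice"}

    def authenticate(self, request: Request) -> Optional[Tuple[str, str]]:
        scheme, _, key = request.headers.get("Authorization", "").partition(" ")
        if scheme != "Token" or key not in self.keys:
            return None          # not handled here; let the next scheme try
        return self.keys[key], key


class HeaderUserAuth:
    """Trusts an 'X-User' header, as a proxy-authenticated setup might."""

    def authenticate(self, request: Request) -> Optional[Tuple[str, None]]:
        name = request.headers.get("X-User")
        return (name, None) if name else None


def perform_authentication(request, authenticators):
    """Try each scheme in order; the first non-None result wins."""
    for authenticator in authenticators:
        result = authenticator.authenticate(request)
        if result is not None:
            request.user, request.auth = result
            return request
    # No scheme recognised the request: treat it as anonymous.
    request.user, request.auth = "anonymous", None
    return request


schemes = [TokenAuth(), HeaderUserAuth()]
signed = perform_authentication(Request({"Authorization": "Token abc123"}), schemes)
proxied = perform_authentication(Request({"X-User": "bob"}), schemes)
unknown = perform_authentication(Request(), schemes)
```

      Swapping or reordering the entries in `schemes` changes how identity is established without touching the code that later reads `request.user` and `request.auth`, which is the point of keeping authentication pluggable.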

      Real-life Example

      Imagine a school with multiple entrances:

      • Main Entrance: Students show their student ID (username and password).
      • VIP Entrance: Teachers use a fingerprint scanner (biometric authentication).
      • Emergency Entrance: Parents receive a temporary access code (token authentication).

      Each entrance verifies identity differently, but all lead into the same school, ensuring only authorized people get in. Similarly, a pluggable authentication system in an application allows different methods to verify users based on the situation.

    1. Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors train a simple machine learning model to improve AlphaFold-Multimer's ability to separate interacting from non-interacting pairs. The improvement is small compared with the default AlphaFold score (AUROC from 0.84 to 0.88).

      Strengths:

      The dataset seems to be carefully constructed.

      Weaknesses:

      The comparison with the state of the art is limited.

      - pDockQ comparison is (likely) incorrect (v2.1 should be used, not v1.0).

      - Comparison with ipTM should be complemented with RankingConfidence (the default AF2-score).

      - Several other scores than pDockQ have been developed for this task.

      - Other methods (by Jianlin Chen) to "improve" quality assessment of AF2-models have been presented - these should at least be cited.

      Lack of ablation studies:

      - Quite likely the most significant contributor is the ipTM (and other scores from AF2). This should be analyzed and discussed.

      Lack of data:

      - The GitHub repository does not contain the models, so the data cannot be examined carefully, nor can the model be retrained.

      - No license is provided for the code in the Git repository.

    1. ViewSets

      "After routing has determined which controller to use for a request, your controller is responsible for making sense of the request and producing the appropriate output." (Ruby on Rails Documentation)

      Django REST framework allows you to combine the logic for a set of related views in a single class, called a ViewSet. In other frameworks you may also find conceptually similar implementations named something like 'Resources' or 'Controllers'.

      A ViewSet class is simply a type of class-based View, that does not provide any method handlers such as .get() or .post(), and instead provides actions such as .list() and .create().

      The method handlers for a ViewSet are only bound to the corresponding actions at the point of finalizing the view, using the .as_view() method.

      Typically, rather than explicitly registering the views in a viewset in the urlconf, you'll register the viewset with a router class, that automatically determines the urlconf for you.

      Example

      Let's define a simple viewset that can be used to list or retrieve all the users in the system.

          from django.contrib.auth.models import User
          from django.shortcuts import get_object_or_404
          from myapps.serializers import UserSerializer
          from rest_framework import viewsets
          from rest_framework.response import Response

          class UserViewSet(viewsets.ViewSet):
              """
              A simple ViewSet for listing or retrieving users.
              """
              def list(self, request):
                  queryset = User.objects.all()
                  serializer = UserSerializer(queryset, many=True)
                  return Response(serializer.data)

              def retrieve(self, request, pk=None):
                  queryset = User.objects.all()
                  user = get_object_or_404(queryset, pk=pk)
                  serializer = UserSerializer(user)
                  return Response(serializer.data)

      Let's break down the concept of ViewSets in Django REST Framework into simpler terms with examples.

      What is a ViewSet?

      In Django REST Framework, a ViewSet is a way to combine the logic for a set of related views into a single class. Instead of writing separate classes or functions for handling different HTTP methods like GET, POST, PUT, DELETE, etc., you can define these in one class.

      How ViewSets Work

      1. ViewSet Class: A ViewSet is a type of class-based view. Unlike regular class-based views where you define methods like .get() or .post(), ViewSets use actions like .list() and .create().

      2. Router: Instead of manually adding each URL for these views, you can use a router that automatically generates the URL patterns for the ViewSet.
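
      The binding of HTTP methods to actions can be illustrated with a small framework-free sketch. It is an assumption-laden toy, not REST framework's code: `ViewSetSketch` and `UserViewSetSketch` are hypothetical classes, responses are plain `(status, body)` tuples, and only the shape of the `as_view({...})` call mirrors the real API.

```python
class ViewSetSketch:
    """Toy base class: maps HTTP methods to action names at as_view() time."""

    @classmethod
    def as_view(cls, actions):
        def view(method, **kwargs):
            # Look up which action this HTTP method was bound to.
            action_name = actions.get(method.lower())
            if action_name is None:
                return 405, "Method Not Allowed"
            instance = cls()
            return getattr(instance, action_name)(**kwargs)
        return view


class UserViewSetSketch(ViewSetSketch):
    users = {1: "alice", 2: "bob"}   # stands in for a queryset

    def list(self):
        return 200, sorted(self.users.values())

    def retrieve(self, pk):
        if pk not in self.users:
            return 404, "Not Found"
        return 200, self.users[pk]


# The same class yields two different views, depending on the bindings.
user_list = UserViewSetSketch.as_view({"get": "list"})
user_detail = UserViewSetSketch.as_view({"get": "retrieve"})
```

      Because the class itself defines no `get()` or `post()`, the same GET request can mean "list" on one URL and "retrieve" on another; the mapping is decided only when the view is finalized.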

      Example

      Let's say we want to create a simple API to list all users or get details of a specific user.

      Step 1: Create a Serializer

      First, we need a serializer to convert our User model to JSON format.

      ```python
      # myapps/serializers.py
      from django.contrib.auth.models import User
      from rest_framework import serializers


      class UserSerializer(serializers.ModelSerializer):
          class Meta:
              model = User
              fields = ['id', 'username', 'email']
      ```

      Step 2: Define the ViewSet

      Next, we define a ViewSet that handles listing users and retrieving a specific user.

      ```python
      # views.py
      from django.contrib.auth.models import User
      from django.shortcuts import get_object_or_404
      from myapps.serializers import UserSerializer
      from rest_framework import viewsets
      from rest_framework.response import Response


      class UserViewSet(viewsets.ViewSet):
          """
          A simple ViewSet for listing or retrieving users.
          """
          def list(self, request):
              queryset = User.objects.all()
              serializer = UserSerializer(queryset, many=True)
              return Response(serializer.data)

          def retrieve(self, request, pk=None):
              queryset = User.objects.all()
              user = get_object_or_404(queryset, pk=pk)
              serializer = UserSerializer(user)
              return Response(serializer.data)
      ```

      • list(): This method handles GET requests to list all users. It fetches all users from the database, serializes them, and returns the JSON response.
      • retrieve(): This method handles GET requests to get details of a specific user based on the primary key (pk).

      Step 3: Register the ViewSet with a Router

      Finally, we use a router to generate the URL patterns for our ViewSet.

      ```python
      # urls.py
      from django.urls import path, include
      from rest_framework.routers import DefaultRouter
      from .views import UserViewSet

      router = DefaultRouter()
      router.register(r'users', UserViewSet, basename='user')

      urlpatterns = [
          path('', include(router.urls)),
      ]
      ```

      Summary

      • ViewSet: A class that groups related views (e.g., list and retrieve) into a single class.
      • Actions: Methods like .list() and .retrieve() that handle specific actions.
      • Router: Automatically generates URL patterns for the ViewSet.

      Using ViewSets and routers simplifies the code and makes it easier to manage related views.
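
      To make "automatically generates URL patterns" concrete, here is a rough sketch of what a router conceptually does. The function `build_routes` and the class `ReadOnlyUsers` are hypothetical, and the URL convention (collection actions at `prefix/`, single-object actions at `prefix/{pk}/`) is assumed for the illustration rather than taken from the router's source.

```python
def build_routes(prefix, viewset_cls):
    """Return (HTTP method, URL pattern, action) triples for the actions a class defines."""
    # Assumed convention for this sketch: collection actions live at prefix/,
    # single-object actions at prefix/{pk}/.
    candidates = [
        ("GET", f"{prefix}/", "list"),
        ("POST", f"{prefix}/", "create"),
        ("GET", f"{prefix}/{{pk}}/", "retrieve"),
        ("PUT", f"{prefix}/{{pk}}/", "update"),
        ("PATCH", f"{prefix}/{{pk}}/", "partial_update"),
        ("DELETE", f"{prefix}/{{pk}}/", "destroy"),
    ]
    # Only emit routes for actions the viewset actually implements.
    return [route for route in candidates if hasattr(viewset_cls, route[2])]


class ReadOnlyUsers:
    """Hypothetical viewset exposing only the two read actions."""

    def list(self):
        ...

    def retrieve(self, pk):
        ...


routes = build_routes("users", ReadOnlyUsers)
```

      A viewset that defines only `list` and `retrieve`, like the `UserViewSet` above, therefore ends up with two GET routes and nothing else.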

    1. Mixins

      The mixin classes provide the actions that are used to provide the basic view behavior. Note that the mixin classes provide action methods rather than defining the handler methods, such as .get() and .post(), directly. This allows for more flexible composition of behavior.

      The mixin classes can be imported from rest_framework.mixins.

      ListModelMixin

      Provides a .list(request, *args, **kwargs) method, that implements listing a queryset.

      If the queryset is populated, this returns a 200 OK response, with a serialized representation of the queryset as the body of the response. The response data may optionally be paginated.

      CreateModelMixin

      Provides a .create(request, *args, **kwargs) method, that implements creating and saving a new model instance.

      If an object is created this returns a 201 Created response, with a serialized representation of the object as the body of the response. If the representation contains a key named url, then the Location header of the response will be populated with that value.

      If the request data provided for creating the object was invalid, a 400 Bad Request response will be returned, with the error details as the body of the response.

      RetrieveModelMixin

      Provides a .retrieve(request, *args, **kwargs) method, that implements returning an existing model instance in a response.

      If an object can be retrieved this returns a 200 OK response, with a serialized representation of the object as the body of the response. Otherwise, it will return a 404 Not Found.

      UpdateModelMixin

      Provides a .update(request, *args, **kwargs) method, that implements updating and saving an existing model instance.

      Also provides a .partial_update(request, *args, **kwargs) method, which is similar to the update method, except that all fields for the update will be optional. This allows support for HTTP PATCH requests.

      If an object is updated this returns a 200 OK response, with a serialized representation of the object as the body of the response.

      If the request data provided for updating the object was invalid, a 400 Bad Request response will be returned, with the error details as the body of the response.

      DestroyModelMixin

      Provides a .destroy(request, *args, **kwargs) method, that implements deletion of an existing model instance.

      If an object is deleted this returns a 204 No Content response, otherwise it will return a 404 Not Found.

      Concrete View Classes

      The following classes are the concrete generic views. If you're using generic views this is normally the level you'll be working at unless you need heavily customized behavior.

      The view classes can be imported from rest_framework.generics.

      CreateAPIView

      Used for create-only endpoints. Provides a post method handler.

      Extends: GenericAPIView, CreateModelMixin

      ListAPIView

      Used for read-only endpoints to represent a collection of model instances. Provides a get method handler.

      Extends: GenericAPIView, ListModelMixin

      RetrieveAPIView

      Used for read-only endpoints to represent a single model instance. Provides a get method handler.

      Extends: GenericAPIView, RetrieveModelMixin

      DestroyAPIView

      Used for delete-only endpoints for a single model instance. Provides a delete method handler.

      Extends: GenericAPIView, DestroyModelMixin

      UpdateAPIView

      Used for update-only endpoints for a single model instance. Provides put and patch method handlers.

      Extends: GenericAPIView, UpdateModelMixin

      ListCreateAPIView

      Used for read-write endpoints to represent a collection of model instances. Provides get and post method handlers.

      Extends: GenericAPIView, ListModelMixin, CreateModelMixin

      RetrieveUpdateAPIView

      Used for read or update endpoints to represent a single model instance. Provides get, put and patch method handlers.

      Extends: GenericAPIView, RetrieveModelMixin, UpdateModelMixin

      RetrieveDestroyAPIView

      Used for read or delete endpoints to represent a single model instance. Provides get and delete method handlers.

      Extends: GenericAPIView, RetrieveModelMixin, DestroyModelMixin

      RetrieveUpdateDestroyAPIView

      Used for read-write-delete endpoints to represent a single model instance. Provides get, put, patch and delete method handlers.

      Extends: GenericAPIView, RetrieveModelMixin, UpdateModelMixin, DestroyModelMixin

      Customizing the generic views

      Often you'll want to use the existing generic views, but use some slightly customized behavior. If you find yourself reusing some bit of customized behavior in multiple places, you might want to refactor the behavior into a common class that you can then just apply to any view or viewset as needed.

      Creating custom mixins

      For example, if you need to lookup objects based on multiple fields in the URL conf, you could create a mixin class like the following:

          class MultipleFieldLookupMixin:
              """
              Apply this mixin to any view or viewset to get multiple field filtering
              based on a `lookup_fields` attribute, instead of the default single field filtering.
              """
              def get_object(self):
                  queryset = self.get_queryset()             # Get the base queryset
                  queryset = self.filter_queryset(queryset)  # Apply any filter backends
                  filter = {}
                  for field in self.lookup_fields:
                      if self.kwargs.get(field):  # Ignore empty fields.
                          filter[field] = self.kwargs[field]
                  obj = get_object_or_404(queryset, **filter)  # Lookup the object
                  self.check_object_permissions(self.request, obj)
                  return obj

      You can then simply apply this mixin to a view or viewset anytime you need to apply the custom behavior.

          class RetrieveUserView(MultipleFieldLookupMixin, generics.RetrieveAPIView):
              queryset = User.objects.all()
              serializer_class = UserSerializer
              lookup_fields = ['account', 'username']

      Using custom mixins is a good option if you have custom behavior that needs to be used.

      Creating custom base classes

      If you are using a mixin across multiple views, you can take this a step further and create your own set of base views that can then be used throughout your project. For example:

          class BaseRetrieveView(MultipleFieldLookupMixin,
                                 generics.RetrieveAPIView):
              pass

          class BaseRetrieveUpdateDestroyView(MultipleFieldLookupMixin,
                                              generics.RetrieveUpdateDestroyAPIView):
              pass

      Using custom base classes is a good option if you have custom behavior that consistently needs to be repeated across a large number of views throughout your project.

      PUT as create

      Prior to version 3.0 the REST framework mixins treated PUT as either an update or a create operation, depending on if the object already existed or not.

      Allowing PUT as create operations is problematic, as it necessarily exposes information about the existence or non-existence of objects. It's also not obvious that transparently allowing re-creating of previously deleted instances is necessarily a better default behavior than simply returning 404 responses.

      Both styles "PUT as 404" and "PUT as create" can be valid in different circumstances, but from version 3.0 onwards we now use 404 behavior as the default, due to it being simpler and more obvious.

      If you need generic PUT-as-create behavior you may want to include something like this AllowPUTAsCreateMixin class as a mixin to your views.

      Third party packages

      The following third party packages provide additional generic view implementations.

      Django Rest Multiple Models

      Django Rest Multiple Models provides a generic view (and mixin) for sending multiple serialized models and/or querysets via a single API request. D

      Mixins in Django REST Framework: Simplified Explanation

      Mixins are small, reusable classes that provide specific behavior to a view class. They are like building blocks that you can combine to create custom views. Instead of defining the handler methods like .get() or .post() directly, mixins provide action methods which allow for more flexible composition of behaviors.
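
      The "building blocks" idea can be shown in plain Python before bringing in the framework. The classes below (`ListMixin`, `CreateMixin`, `BaseView`, `UserListCreate`) are hypothetical stand-ins that return `(status, body)` tuples; they only imitate the division of labour between action-providing mixins and thin handler methods.

```python
class ListMixin:
    """Provides the `list` action; expects the host class to define get_items()."""

    def list(self):
        return 200, list(self.get_items())


class CreateMixin:
    """Provides the `create` action with a minimal validity check."""

    def create(self, data):
        if not data.get("username"):
            return 400, {"username": "This field is required."}
        self.store.append(data)
        return 201, data


class BaseView:
    """Toy counterpart of a generic base view: owns the data, defines no handlers."""

    def __init__(self):
        self.store = []

    def get_items(self):
        return self.store


class UserListCreate(ListMixin, CreateMixin, BaseView):
    # Handler methods are thin wrappers that delegate to the mixin actions.
    def get(self):
        return self.list()

    def post(self, data):
        return self.create(data)


view = UserListCreate()
created = view.post({"username": "alice"})
rejected = view.post({})
listing = view.get()
```

      Because the actions live in the mixins and the handlers are one-line delegations, a read-only or create-only variant is just a different combination of the same parts.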

      Common Mixin Classes

      Here are some commonly used mixin classes from rest_framework.mixins:

      1. ListModelMixin
         • Purpose: Provides the ability to list a queryset.
         • Method: .list(request, *args, **kwargs)
         • Example:

         ```python
         from rest_framework import generics, mixins
         from django.contrib.auth.models import User
         from myapp.serializers import UserSerializer


         class UserList(mixins.ListModelMixin, generics.GenericAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer

             def get(self, request, *args, **kwargs):
                 return self.list(request, *args, **kwargs)
         ```

      2. CreateModelMixin
         • Purpose: Provides the ability to create a new model instance.
         • Method: .create(request, *args, **kwargs)
         • Example:

         ```python
         class UserCreate(mixins.CreateModelMixin, generics.GenericAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer

             def post(self, request, *args, **kwargs):
                 return self.create(request, *args, **kwargs)
         ```

      3. RetrieveModelMixin
         • Purpose: Provides the ability to retrieve a single model instance.
         • Method: .retrieve(request, *args, **kwargs)
         • Example:

         ```python
         class UserDetail(mixins.RetrieveModelMixin, generics.GenericAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer

             def get(self, request, *args, **kwargs):
                 return self.retrieve(request, *args, **kwargs)
         ```

      4. UpdateModelMixin
         • Purpose: Provides the ability to update an existing model instance.
         • Methods: .update(request, *args, **kwargs), .partial_update(request, *args, **kwargs)
         • Example:

         ```python
         class UserUpdate(mixins.UpdateModelMixin, generics.GenericAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer

             def put(self, request, *args, **kwargs):
                 return self.update(request, *args, **kwargs)

             def patch(self, request, *args, **kwargs):
                 return self.partial_update(request, *args, **kwargs)
         ```

      5. DestroyModelMixin
         • Purpose: Provides the ability to delete an existing model instance.
         • Method: .destroy(request, *args, **kwargs)
         • Example:

         ```python
         class UserDelete(mixins.DestroyModelMixin, generics.GenericAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer

             def delete(self, request, *args, **kwargs):
                 return self.destroy(request, *args, **kwargs)
         ```

      Concrete View Classes

      Concrete view classes combine generic views and mixins to provide common patterns. Here are some examples:

      1. CreateAPIView
         • Purpose: Create-only endpoints.
         • Usage:

         ```python
         from rest_framework import generics


         class UserCreateView(generics.CreateAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      2. ListAPIView
         • Purpose: Read-only endpoints for a collection of model instances.
         • Usage:

         ```python
         class UserListView(generics.ListAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      3. RetrieveAPIView
         • Purpose: Read-only endpoints for a single model instance.
         • Usage:

         ```python
         class UserDetailView(generics.RetrieveAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      4. DestroyAPIView
         • Purpose: Delete-only endpoints for a single model instance.
         • Usage:

         ```python
         class UserDeleteView(generics.DestroyAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      5. UpdateAPIView
         • Purpose: Update-only endpoints for a single model instance.
         • Usage:

         ```python
         class UserUpdateView(generics.UpdateAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      6. ListCreateAPIView
         • Purpose: Read-write endpoints for a collection of model instances.
         • Usage:

         ```python
         class UserListCreateView(generics.ListCreateAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      7. RetrieveUpdateAPIView
         • Purpose: Read or update endpoints for a single model instance.
         • Usage:

         ```python
         class UserRetrieveUpdateView(generics.RetrieveUpdateAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      8. RetrieveDestroyAPIView
         • Purpose: Read or delete endpoints for a single model instance.
         • Usage:

         ```python
         class UserRetrieveDestroyView(generics.RetrieveDestroyAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      9. RetrieveUpdateDestroyAPIView
         • Purpose: Read-write-delete endpoints for a single model instance.
         • Usage:

         ```python
         class UserRetrieveUpdateDestroyView(generics.RetrieveUpdateDestroyAPIView):
             queryset = User.objects.all()
             serializer_class = UserSerializer
         ```

      Customizing Generic Views with Mixins

      You can create custom mixins to encapsulate specific behaviors and reuse them across multiple views. Here's an example of a custom mixin for looking up objects based on multiple fields:

      python class MultipleFieldLookupMixin: """ Apply this mixin to any view or viewset to get multiple field filtering based on a `lookup_fields` attribute, instead of the default single field filtering. """ def get_object(self): queryset = self.get_queryset() # Get the base queryset queryset = self.filter_queryset(queryset) # Apply any filter backends filter = {} for field in self.lookup_fields: if self.kwargs.get(field): # Ignore empty fields. filter[field] = self.kwargs[field] obj = get_object_or_404(queryset, **filter) # Lookup the object self.check_object_permissions(self.request, obj) return obj

      Using Custom Mixins

      You can use the custom mixin with any view to apply the custom behavior:

      python class RetrieveUserView(MultipleFieldLookupMixin, generics.RetrieveAPIView): queryset = User.objects.all() serializer_class = UserSerializer lookup_fields = ['account', 'username']

      Custom Base Classes

      If you frequently use a mixin across multiple views, create custom base classes:

      ```python
      class BaseRetrieveView(MultipleFieldLookupMixin, generics.RetrieveAPIView):
          pass


      class BaseRetrieveUpdateDestroyView(MultipleFieldLookupMixin, generics.RetrieveUpdateDestroyAPIView):
          pass
      ```

      This way, you can reuse the custom behavior consistently across your project.

      In summary, mixins in Django REST Framework allow you to compose views with reusable actions, making your code modular and easier to maintain. The combination of mixins and generic views helps you quickly build standard CRUD (Create, Read, Update, Delete) operations with minimal code.

    2. Generic views

      "Django’s generic views... were developed as a shortcut for common usage patterns... They take certain common idioms and patterns found in view development and abstract them so that you can quickly write common views of data without having to repeat yourself." (Django Documentation)

      One of the key benefits of class-based views is the way they allow you to compose bits of reusable behavior. REST framework takes advantage of this by providing a number of pre-built views that provide for commonly used patterns.

      The generic views provided by REST framework allow you to quickly build API views that map closely to your database models.

      If the generic views don't suit the needs of your API, you can drop down to using the regular APIView class, or reuse the mixins and base classes used by the generic views to compose your own set of reusable generic views.

      Examples

      Typically when using the generic views, you'll override the view, and set several class attributes.

      ```python
      from django.contrib.auth.models import User
      from myapp.serializers import UserSerializer
      from rest_framework import generics
      from rest_framework.permissions import IsAdminUser

      class UserList(generics.ListCreateAPIView):
          queryset = User.objects.all()
          serializer_class = UserSerializer
          permission_classes = [IsAdminUser]
      ```

      For more complex cases you might also want to override various methods on the view class. For example.

      ```python
      class UserList(generics.ListCreateAPIView):
          queryset = User.objects.all()
          serializer_class = UserSerializer
          permission_classes = [IsAdminUser]

          def list(self, request):
              # Note the use of `get_queryset()` instead of `self.queryset`
              queryset = self.get_queryset()
              serializer = UserSerializer(queryset, many=True)
              return Response(serializer.data)
      ```

      For very simple cases you might want to pass through any class attributes using the .as_view() method. For example, your URLconf might include something like the following entry:

      ```python
      path('users/', ListCreateAPIView.as_view(queryset=User.objects.all(), serializer_class=UserSerializer), name='user-list')
      ```

      API Reference

      GenericAPIView

      This class extends REST framework's APIView class, adding commonly required behavior for standard list and detail views. Each of the concrete generic views provided is built by combining GenericAPIView, with one or more mixin classes.

      Attributes

      Basic settings: The following attributes control the basic view behavior.

      • queryset - The queryset that should be used for returning objects from this view. Typically, you must either set this attribute, or override the get_queryset() method. If you are overriding a view method, it is important that you call get_queryset() instead of accessing this property directly, as queryset will get evaluated once, and those results will be cached for all subsequent requests.
      • serializer_class - The serializer class that should be used for validating and deserializing input, and for serializing output. Typically, you must either set this attribute, or override the get_serializer_class() method.
      • lookup_field - The model field that should be used for performing object lookup of individual model instances. Defaults to 'pk'. Note that when using hyperlinked APIs you'll need to ensure that both the API views and the serializer classes set the lookup fields if you need to use a custom value.
      • lookup_url_kwarg - The URL keyword argument that should be used for object lookup. The URL conf should include a keyword argument corresponding to this value. If unset this defaults to using the same value as lookup_field.

      Pagination: The following attributes are used to control pagination when used with list views.

      • pagination_class - The pagination class that should be used when paginating list results. Defaults to the same value as the DEFAULT_PAGINATION_CLASS setting, which is 'rest_framework.pagination.PageNumberPagination'. Setting pagination_class=None will disable pagination on this view.

      Filtering:

      • filter_backends - A list of filter backend classes that should be used for filtering the queryset. Defaults to the same value as the DEFAULT_FILTER_BACKENDS setting.

      Methods

      Base methods:

      get_queryset(self)

      Returns the queryset that should be used for list views, and that should be used as the base for lookups in detail views. Defaults to returning the queryset specified by the queryset attribute.

      This method should always be used rather than accessing self.queryset directly, as self.queryset gets evaluated only once, and those results are cached for all subsequent requests.

      May be overridden to provide dynamic behavior, such as returning a queryset, that is specific to the user making the request. For example:

      ```python
      def get_queryset(self):
          user = self.request.user
          return user.accounts.all()
      ```

      Note: If the serializer_class used in the generic view spans orm relations, leading to an n+1 problem, you could optimize your queryset in this method using select_related and prefetch_related. To get more information about n+1 problem and use cases of the mentioned methods refer to related section in django documentation.

      get_object(self)

      Returns an object instance that should be used for detail views. Defaults to using the lookup_field parameter to filter the base queryset.

      May be overridden to provide more complex behavior, such as object lookups based on more than one URL kwarg. For example:

      ```python
      def get_object(self):
          queryset = self.get_queryset()
          filter = {}
          for field in self.multiple_lookup_fields:
              filter[field] = self.kwargs[field]

          obj = get_object_or_404(queryset, **filter)
          self.check_object_permissions(self.request, obj)
          return obj
      ```

      Note that if your API doesn't include any object level permissions, you may optionally exclude the self.check_object_permissions, and simply return the object from the get_object_or_404 lookup.

      filter_queryset(self, queryset)

      Given a queryset, filter it with whichever filter backends are in use, returning a new queryset. For example:

      ```python
      def filter_queryset(self, queryset):
          filter_backends = [CategoryFilter]

          if 'geo_route' in self.request.query_params:
              filter_backends = [GeoRouteFilter, CategoryFilter]
          elif 'geo_point' in self.request.query_params:
              filter_backends = [GeoPointFilter, CategoryFilter]

          for backend in list(filter_backends):
              queryset = backend().filter_queryset(self.request, queryset, view=self)

          return queryset
      ```

      get_serializer_class(self)

      Returns the class that should be used for the serializer. Defaults to returning the serializer_class attribute.

      May be overridden to provide dynamic behavior, such as using different serializers for read and write operations, or providing different serializers to different types of users. For example:

      ```python
      def get_serializer_class(self):
          if self.request.user.is_staff:
              return FullAccountSerializer
          return BasicAccountSerializer
      ```

      Save and deletion hooks: The following methods are provided by the mixin classes, and provide easy overriding of the object save or deletion behavior.

      • perform_create(self, serializer) - Called by CreateModelMixin when saving a new object instance.
      • perform_update(self, serializer) - Called by UpdateModelMixin when saving an existing object instance.
      • perform_destroy(self, instance) - Called by DestroyModelMixin when deleting an object instance.

      These hooks are particularly useful for setting attributes that are implicit in the request, but are not part of the request data. For instance, you might set an attribute on the object based on the request user, or based on a URL keyword argument.

      ```python
      def perform_create(self, serializer):
          serializer.save(user=self.request.user)
      ```

      These override points are also particularly useful for adding behavior that occurs before or after saving an object, such as emailing a confirmation, or logging the update.

      ```python
      def perform_update(self, serializer):
          instance = serializer.save()
          send_email_confirmation(user=self.request.user, modified=instance)
      ```

      You can also use these hooks to provide additional validation, by raising a ValidationError(). This can be useful if you need some validation logic to apply at the point of database save. For example:

      ```python
      def perform_create(self, serializer):
          queryset = SignupRequest.objects.filter(user=self.request.user)
          if queryset.exists():
              raise ValidationError('You have already signed up')
          serializer.save(user=self.request.user)
      ```

      Other methods: You won't typically need to override the following methods, although you might need to call into them if you're writing custom views using GenericAPIView.

      • get_serializer_context(self) - Returns a dictionary containing any extra context that should be supplied to the serializer. Defaults to including 'request', 'view' and 'format' keys.
      • get_serializer(self, instance=None, data=None, many=False, partial=False) - Returns a serializer instance.
      • get_paginated_response(self, data) - Returns a paginated style Response object.
      • paginate_queryset(self, queryset) - Paginate a queryset if required, either returning a page object, or None if pagination is not configured for this view.
      • filter_queryset(self, queryset) - Given a queryset, filter it with whichever filter backends are in use, returning a new queryset.
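The save hooks described above follow a template-method pattern: the mixin's `create()` does the fixed work and calls `perform_create()`, which a subclass may override. Below is a rough, framework-free sketch of that flow. `FakeSerializer`, `CreateSketch`, `SignupCreate`, the local `ValidationError`, and the list used as a "database" are all hypothetical stand-ins, so this shows the shape of the pattern rather than REST framework's actual implementation.

```python
class ValidationError(Exception):
    """Local stand-in for the framework's validation error."""


class FakeSerializer:
    """Hypothetical serializer: save() merges extra kwargs and stores the row."""

    def __init__(self, data, store):
        self.data = data
        self.store = store

    def save(self, **extra):
        row = {**self.data, **extra}
        self.store.append(row)
        return row


class CreateSketch:
    """Hypothetical create mixin: create() delegates the save to a hook."""

    def __init__(self, user, store):
        self.user = user
        self.store = store

    def create(self, data):
        serializer = FakeSerializer(data, self.store)
        self.perform_create(serializer)  # the override point
        return self.store[-1]

    def perform_create(self, serializer):
        serializer.save()


class SignupCreate(CreateSketch):
    def perform_create(self, serializer):
        # Validation at the point of save, mirroring the quoted example.
        if any(row.get("user") == self.user for row in self.store):
            raise ValidationError("You have already signed up")
        serializer.save(user=self.user)


store = []
view = SignupCreate(user="ann", store=store)
first = view.create({"plan": "basic"})

try:
    view.create({"plan": "pro"})
    second_rejected = False
except ValidationError:
    second_rejected = True
```

Only the hook changes between views; the surrounding `create()` logic stays shared.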

      Generic Views in Django and Django REST Framework: Simplified Explanation

      What are Generic Views?

      Django's Generic Views:

      • Purpose: Generic views in Django are pre-built views designed to handle common patterns and tasks. Instead of writing repetitive code for common functionalities, you can use these views to save time.
      • Example: If you want to create a view to list all users or create a new user, instead of writing the code from scratch, you can use a generic view that already handles these tasks.

      Django REST Framework's (DRF) Generic Views:

      • Purpose: Similar to Django's generic views, DRF provides generic views for building API endpoints quickly and efficiently. These views map closely to your database models.
      • Example: If you need an API endpoint to list all users or create a new user via an API call, DRF's generic views can do this with minimal code.

      Benefits of Using Generic Views

      • Code Reusability: Generic views abstract common patterns, reducing the need to write repetitive code.
      • Simplicity: Makes it easy to set up views for standard operations like listing, creating, updating, or deleting records.
      • Customization: You can easily customize these views by setting class attributes or overriding methods.

      Basic Example of Using Generic Views

      Here’s a simple example of how to use a generic view in Django REST Framework to build an API endpoint that lists and creates users:

      ```python
      from django.contrib.auth.models import User
      from myapp.serializers import UserSerializer
      from rest_framework import generics
      from rest_framework.permissions import IsAdminUser


      class UserList(generics.ListCreateAPIView):
          queryset = User.objects.all()  # The queryset of all User objects
          serializer_class = UserSerializer  # The serializer to use for input/output
          permission_classes = [IsAdminUser]  # Only allow admin users to access this view
      ```

      Overriding Methods for Custom Behavior

      You can customize the behavior of these views by overriding methods. For example, to customize the list method:

      ```python
      from rest_framework.response import Response


      class UserList(generics.ListCreateAPIView):
          queryset = User.objects.all()
          serializer_class = UserSerializer
          permission_classes = [IsAdminUser]

          def list(self, request):
              queryset = self.get_queryset()  # Use get_queryset() to get the data
              serializer = UserSerializer(queryset, many=True)
              return Response(serializer.data)
      ```
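The reason for calling `get_queryset()` rather than reading `self.queryset` is that the class attribute is shared across requests and keeps its cached results once evaluated. The toy model below illustrates this, assuming a hypothetical `FakeQuerySet` that caches on first iteration and returns an uncached clone from `.all()`. It imitates the behavior the documentation describes and is not Django's `QuerySet`.

```python
class FakeQuerySet:
    """Hypothetical queryset: caches results on first iteration."""

    def __init__(self, table):
        self._table = table  # shared list standing in for a database table
        self._cache = None

    def all(self):
        # A fresh clone with an empty cache, so it re-reads the table.
        return FakeQuerySet(self._table)

    def __iter__(self):
        if self._cache is None:
            self._cache = list(self._table)
        return iter(self._cache)


table = ["ann"]


class UserListSketch:
    queryset = FakeQuerySet(table)  # class attribute, shared by every request

    def get_queryset(self):
        return self.queryset.all()


list(UserListSketch.queryset)  # first "request" evaluates and caches the attribute
table.append("bob")            # a new row arrives afterwards

stale = list(UserListSketch.queryset)          # direct access reuses the cache
fresh = list(UserListSketch().get_queryset())  # the clone sees the new row
```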

      Using Generic Views in URL Configuration

      You can directly use the generic view in your URL configuration:

      ```python
      from django.urls import path
      from rest_framework.generics import ListCreateAPIView
      from django.contrib.auth.models import User
      from myapp.serializers import UserSerializer

      urlpatterns = [
          path(
              'users/',
              ListCreateAPIView.as_view(queryset=User.objects.all(), serializer_class=UserSerializer),
              name='user-list',
          ),
      ]
      ```

      Important Attributes and Methods in GenericAPIView

      • queryset: The set of data that the view will operate on.
      • serializer_class: The class used to serialize and deserialize data.
      • lookup_field: The field used to look up individual model instances (default is 'pk').
      • pagination_class: The class used for paginating results.
      • filter_backends: List of classes used to filter the queryset.

      Common Methods:

      • get_queryset(self): Returns the queryset to use for the view.
      • get_object(self): Returns a single object instance for detail views.
      • filter_queryset(self, queryset): Filters the queryset based on the filter backends.
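As a rough illustration of how `filter_queryset` chains filter backends, the sketch below applies each backend in turn to the result of the previous one. The backend classes, the dict used as a request, and the list used as a queryset are hypothetical stand-ins. Only the `filter_queryset(self, request, queryset, view)` method shape follows REST framework's backend interface.

```python
class ActiveOnly:
    """Hypothetical backend: keep rows flagged as active."""

    def filter_queryset(self, request, queryset, view):
        return [row for row in queryset if row["active"]]


class NameContains:
    """Hypothetical backend: optional substring search on the name."""

    def filter_queryset(self, request, queryset, view):
        term = request.get("search")
        if not term:
            return queryset
        return [row for row in queryset if term in row["name"]]


def apply_backends(backends, request, queryset, view=None):
    # Each backend receives the output of the previous one.
    for backend in backends:
        queryset = backend().filter_queryset(request, queryset, view)
    return queryset


rows = [
    {"name": "ann", "active": True},
    {"name": "anna", "active": False},
    {"name": "bob", "active": True},
]
searched = apply_backends([ActiveOnly, NameContains], {"search": "an"}, rows)
unsearched = apply_backends([ActiveOnly, NameContains], {}, rows)
```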

      Example of Customizing get_queryset

      If you want to customize the queryset based on the user making the request:

      ```python
      def get_queryset(self):
          user = self.request.user
          return user.accounts.all()
      ```

      In summary, Django and Django REST Framework's generic views provide a powerful and efficient way to handle common view patterns, making development faster and more maintainable. You can use, customize, and extend these views to suit the specific needs of your application.

    1. We would like to thank you and the reviewers for your thoughtful comments, which helped us improve the manuscript. We carefully followed the reviewers’ recommendations and provide a detailed point-by-point account of our responses to the comments. 

      Please find below the important changes in the updated manuscript.

      (1) We changed the title according to the comments provided by reviewer #1.

      (2) We edited the introduction, results, and discussion to improve the link between the objectives of the study, the findings, and their discussion, as reviewer #2 recommended.

      (3) We clarified the link between camouflage and fitness, which is now presented as a hypothesis, as reviewer #1 suggested.

      (4) We added new analyses and figures in the main text and in the supplementary materials to better emphasize sex differences in landing force, foraging strategies and hunting success, following reviewer #1's suggestion.

      (5) Following reviewer #2's comments, we edited the results, adding key information about methods to help the reader understand the findings without reading the Methods section.

      (6) We added important details about the model selection approach along with a discussion of the low R-square values reported in our analyses on hunting success, as reviewer #2 suggested.

      eLife assessment 

      This fundamental work substantially advances our understanding of animals' foraging behaviour, by monitoring the movement and body posture of barn owls in high resolution, in addition to assessing their foraging success. With a large dataset, the evidence supporting the main conclusions is convincing. This work provides new evidence for motion-induced sound camouflage and has broad implications for understanding predator-prey interactions. 

      Public Reviews: 

      Reviewer #1 (Public Review): 

      In this paper, Schalcher et al. examined how barn owls' landing force affects their hunting success during two hunting strategies: strike hunting and sit-and-wait hunting. They tracked tens of barn owls that raised their nestlings in nest boxes and utilized high-resolution GPS and acceleration loggers to monitor their movements. In addition, camcorders were placed near their nest boxes and used to record the prey they brought to the nest, thus measuring their foraging success. 

      This study generated a unique dataset and provided new insights into the foraging behavior of barn owls. The researchers discovered that the landing force during hunting strikes was significantly higher compared to the sit-and-wait strategy. Additionally, they found a positive relationship between landing force and foraging success during hunting strikes, whereas, during the sit-and-wait strategy, there was a negative relationship between the two. This suggests that barn owls avoid detection by generating a lower landing force and producing less noise. Furthermore, the researchers observed that environmental characteristics affect barn owls' landing force during sit-and-wait hunting. They found a greater landing force when landing on buildings, a lower landing force when landing on trees, and the lowest landing force when landing on poles. The landing force also decreased as the time to the next hunting attempt decreased. These findings collectively suggest that barn owls reduce their landing force as an acoustic camouflage to avoid detection by their prey. 

      The main strength of this work is the researchers' comprehensive approach, examining different aspects of foraging behavior, including high-resolution movement, foraging success, and the influence of the environment on this behavior, supported by impressive data collection. The weakness of this study is that the results only present a partial biological story contained within the data. The focus is on acoustic camouflage without addressing other aspects of barn owls' foraging strategy, leaving the reader with many unanswered questions. These include individual differences, direct measurements of owls' fitness, a detailed analysis of the foraging strategy of males and females, and the collective effort per nest box. However, it is possible that these data will be published in a separate paper. 

      We greatly appreciate your recognition of the comprehensive approach and extensive data collection. Our primary objective was to study the role of acoustic camouflage. Nonetheless, the manuscript now includes a detailed analysis of the foraging strategy and hunting success of males and females (lines 164-225).

      The results presented support the authors' conclusion that lower landing force during sit-and-wait hunting increases hunting success, likely due to a decreased probability of detection by their prey, resulting in acoustic camouflage. The authors also argue that hunting success is crucial for survival, and thus, acoustic camouflage has a direct link to fitness. While this statement is reasonable, it should be presented as a hypothesis, as no direct evidence has been provided here.

      Thank you for the comment. We agree and thus have edited the language accordingly.  

      However, since information about nestling survival is typically monitored when studying behavior during the breeding period, the authors' knowledge of the effect of acoustic camouflage on owls' fitness can probably be provided. Furthermore, it will be interesting to further examine the foraging strategies used by different individuals during foraging, the joint foraging success of both males and females within each nest box, and the link between landing force and foraging success if the data are available.

      We are currently writing a manuscript on these topics. We are aware that several scientific questions regarding the foraging ecology of the barn owl still need our attention. Regarding the link between landing force and foraging success, we believe that our revised manuscript addresses this specific topic; please see the specific responses below.

      However, even without this additional analysis on survival, this paper provides an unprecedented dataset and the first measurement of landing force during hunting in the wild. It is likely to inspire many other researchers currently studying animal foraging behavior to explore how animals' movements affect foraging success.

      Reviewer #2 (Public Review): 

      Summary: 

      The authors provide new evidence for motion-induced sound camouflage and can link the hunting approach to hunting success (detailing the adaptation and inferring a fitness consequence). 

      Strengths: 

      Strong evidence by combining high-resolution accelerometer data with a ground-truthed data set on prey provisioning at nest boxes. A good set of co-variates to control for some of the noise in the data provides some additional insights into owl hunting attempts. 

      Weaknesses: 

      There is a disconnect between the hypotheses tested and the results presented, and insufficient detail is provided on the statistical approach. R2 values of the presented models are very small compared to the significance of the effect presented. Without more detail, it is impossible to assess the strength of the evidence.

      In the revised manuscript, we changed the way results are presented and we improved the link between the hypotheses and the results. The R2 values are indeed small. It is however important to keep in mind that we are assessing the outcome of one specific behavior (i.e. landing force during sit-and-wait hunts) on hunting success in a wild environment, where many complex ecological interactions likely influence hunting success. Nonetheless, the coefficients (as reported in the results) show that for every 1 N increase in landing force, there is a 15% reduction in hunting success, which is substantial. In the discussion we also note that 50 Hz is a relatively low sampling frequency for estimating the peak ground reaction force. We have gone back over the presentation of our results and made our discussion more nuanced to acknowledge this aspect. 

      We have also added a detailed description about our model selection process in the methods section and provide a model selection table for each analysis in the supplementary materials.

      The authors seem to overcome persisting challenges associated with the validation and calibration of accelerometer data by ground-truthing on-board measures with direct observations in captivity, but here the methods are not described any further and sample sizes (2 owls - how many different loggers were deployed?) might be too small to achieve robust behavioural classifications.

      Thank you for the comment. Details of our methods of behavioural identification are provided in lines 385 – 429. There are two reasons why our results should not be limited by the sample size. First, we used the temporal sequence of changes in acceleration, and rates of change in acceleration data, which make the methods robust to individual differences in acceleration values. Second, our methods for behavioural identification were not based on machine learning. Instead, we use a Boolean-based approach (as described in Wilson et al. 2018. MEE), which is more robust to small differences in absolute values that might occur, e.g., in relation to slight changes in device position. 

      Recommendation for the authors:

      Reviewer #1 (Recommendations For The Authors): 

      Comment 1. This study provides new insights into animals' foraging behavior and will probably inspire other researchers to examine foraging behavior in such high resolution.

      We hope so, thank you.

      Comment 2. However, it is necessary to describe better the measured landing force and the hunting strike and perching behavior so the readers can understand these methods when reading the results (and without reading the Methods).

      We have now changed the text in the “Results” to help the reader understand the key methods while reading the results.

      Comment 3. In addition, make sure you use the same terminology for hunting strategies during the entire paper and especially in all figures and corresponding result descriptions.

      We now use consistent terminology throughout the text and figures. We hope that this is now clear in the revised manuscript.

      Comment 4. In addition, although I find your statement about the link between acoustic camouflage and fitness reasonable, it should be described as a hypothesis or examined if you want to keep the direct link statement. I believe showing a direct link can add an additional outstanding aspect to this paper, but I also understand that it can be addressed in a separate paper.

      We agree that the relationship between hunting success and barn owl fitness is an important topic, but it necessitates a consideration of both hunting strategies, including hunting on the wing, which extends beyond the limits of our current study. Indeed, our primary objective was to conduct a detailed examination of the interplay between acoustic camouflage and the success of the sit-and-wait technique.

      However, we have edited the manuscript to explicitly describe the link between acoustic camouflage and fitness as a hypothesis. We believe this adjustment provides a more accurate representation of our approach. We hope this clarifies the specific emphasis of our work and its contribution to the understanding of barn owl hunting behavior.

      Here are my detailed comments about the paper: 

      Comment 5. Title: Consider changing the title to "Acoustic camouflage predicts hunting success in a wild predator." 

      Thank you for this suggestion. However, we opted for a different title, which is now “Landing force reveals new form of motion-induced sound camouflage in a wild predator”.

      Comment 6. Line 91-93: Please provide additional information about the collected dataset, including: 

      Description of the total period of observations, an average and standard deviation of perching and hunting attempt events per individual per night, number of foraging trips per individual per night, details about the geographic location and characteristics of the habitat, season, and reproductive state. 

      The revised manuscript now includes detailed information about the collected dataset (i.e. study area, reproductive state, etc…). “We used GPS loggers and accelerometers to record high resolution movement data during two consecutive breeding seasons (May to August in 2019 and 2020) from 163 wild barn owls (79 males and 84 females) breeding in nest boxes across a 1,000 km² intensive agricultural landscape in the western Swiss plateau.” Results section, lines 79 – 82

      Details about the number of foraging trips per individual per night are now presented in the results: “Sexual dimorphism in body mass was marked among our sampled individuals. Males were lighter than females (84 females, average body mass: 322 ± 22.6 g; 79 males, average body mass 281 ± 16.5 g, Fig S6) and provided almost three times more prey per night than females (males: 8 ± 5 prey per night; females: 3 ± 3 prey per night; Fig.S7). Males also displayed higher nightly hunting effort than females (Males: 46 ± 16 hunting attempts per night, n= 79; Females: 25 ± 11 hunting attempts per nights, n=84; Fig. 3A, Fig S8). However, females were more likely to use a sit and wait strategy than males (females: 24% ± 15%, males: 13% ± 10%, Fig.S9). As a result, the number of perching events per night was similar between males and females (Females: 76 ± 23 perching events per nights; Males: 69 ± 20 perching events per night; Fig S8).” (lines 165 – 174) 

      Comment 7. In addition, state if the information describes breeding pairs of males and females and provides statistics on the number of tracked pairs and the number of nest boxes.

      The revised manuscript now includes a description of the number of tracked breeding pairs and the number of nest boxes. “Of these individuals, 142 belonged to pairs for which data were recovered from both partners (71 pairs in total, 40 in 2019, 31 in 2020). The remaining 21 individuals belonged to pairs with data from one partner (11 females and 1 male in 2019; 4 females and 5 males in 2020).” (lines 82 – 85.)

      Comment 8. Line 93: Briefly define the term "landing force" and explain how it was measured (and let the reader know that there is a detailed description in the Methods).

      We now include a brief definition of the “landing force” along with a brief explanation of how it was measured in the results section. “We extracted the peak vectoral sum of the raw acceleration during each landing and converted this to ground reaction force (hereafter “landing force”, in Newtons) using measurements of individual body mass (see methods for detailed description).” (lines 92 – 95).
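To make the quoted definition concrete, here is a minimal sketch of one way the conversion could be computed. It assumes tri-axial acceleration samples expressed in units of g and takes force as body mass times peak acceleration. The function names and sample values are hypothetical, and details such as whether the static gravity component is subtracted are not stated in this response; the Methods section is the authority on the actual procedure.

```python
import math

G = 9.81  # standard gravity in m/s^2, used to convert g to m/s^2


def peak_vectorial_sum(samples_g):
    """Largest vector norm over tri-axial samples, each an (x, y, z) tuple in g."""
    return max(math.sqrt(x * x + y * y + z * z) for x, y, z in samples_g)


def landing_force_newtons(samples_g, body_mass_g):
    """Body mass (converted to kg) times peak acceleration (converted to m/s^2)."""
    return (body_mass_g / 1000.0) * peak_vectorial_sum(samples_g) * G


def mass_specific_force(force_n, body_mass_g):
    """Force per unit body mass, in N/kg."""
    return force_n / (body_mass_g / 1000.0)


# Made-up samples for one landing; 281 g is the mean male body mass quoted above.
samples = [(0.0, 0.0, 1.0), (1.0, 2.0, 2.0), (0.5, 0.5, 1.0)]
force = landing_force_newtons(samples, body_mass_g=281)
specific = mass_specific_force(force, body_mass_g=281)
```

Dividing by body mass, as in `mass_specific_force`, gives the kind of mass-specific comparison the authors describe for males and females.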

      Comment 9. Line 94: All definitions, including "pre-hunting force," need to be better described in the Results section.

      Thank you for this suggestion. We now provide a better description of those key definitions directly in the results section: 

      Measurement of landing force: “Barn owls employing a sit-and-wait strategy land on multiple perches before initiating an attack, with successive landings reducing the distance to the target prey (Fig. 2C). 

      We used the acceleration data to identify 84,855 landings. These were further categorized into perching events (n = 56,874) and hunting strikes (n = 27,981), depending whether barn owls were landing on a perch or attempting to strike prey on the ground (Fig. 1A and B, see methods for specific details on behavioral classification).” (lines 88 – 95)

      Pre-hunt perching force predicts hunting success: “Finally, we analyzed whether the landing force in the last perching event before each hunting attempt (i.e. pre-hunt perching force) predicted variation in hunting success” (lines 229 – 230)

      Comment 10. Line 102: Remove "Our analysis of 27,981 hunting strikes showed that" and add "n = 27,981" after the statistics. You have already stated your sample size earlier. There is no need to emphasize it again, although your sample size is impressive.

      We modified the text in the results section as suggested.

      Comment 11. Line 104: The results so far suggest that the difference in landing force between males and females is an outcome of their different body masses. However, it is not clear what is the reason for the difference in the number of hunting strike attempts between males and females (Lines 104-106). Can you compare the difference in landing force between males and females with similar body mass (females from the lower part of the distribution and males from the upper part)? Is there still a difference?

      Thank you. Following your comment, we ran new analyses that clarify how the landing force in perching and hunting strike events differs between the sexes. First, we want to clarify why the number of hunting attempts differs between males and females. During the breeding season, females typically perform most of the incubation, brooding, and feeding of nestlings in the nest, while the male primarily hunts food for the female and chicks. The female's contribution to food provisioning is very irregular and varies from pair to pair (paper in prep.). The difference in the number of hunting attempts between males and females reflects this asymmetry in food provisioning between the sexes during this specific period. We specified this in the revised version of the manuscript (lines 164 – 174). 

      We also provide a new analysis to investigate sex differences in mass-specific landing force (force/body mass). We found that males and females produce similar force per unit of body mass during perching events. This demonstrates that the overall higher perching force in females (see Fig. 4C in the manuscript) is driven by their higher body mass. (lines 194 – 199)

      Comment 12. Line 154: I believe Boonman et al. (2018) is relevant to this part of the discussion. Boonman, Arjan, et al. found that barn owl noise during landing and taking off is worth considering. ["The sounds of silence: barn owl noise in landing and taking off." Behavioral Processes 157 (2018): 484-488.]

      We now cite this paper in the discussion.

      Comment 13. Line 164: Your results do not directly demonstrate a link to fitness, although they potentially serve as a proxy for fitness (add a reference). However, you might have information regarding nestlings' survival - that will provide a direct link for fitness. Change your statement or add the relevant data.

      We appreciated your feedback, and we adjusted the language accordingly.

      Comment 14. Line 213: If the poles are closer to the ground - is it possible that the higher trees and buildings serve for resting and gathering environmental information over greater distances? For example, identifying prey at farther distances or navigating to the next pole?

      Yes, this is indeed the most likely explanation for the fact that owls land more on buildings and trees than on poles until the last period (about 6 minutes) before hunting. In these last minutes, barn owls preferentially use poles, as we showed in figure 2B. The revised manuscript now includes this explanation in the discussion (lines 269 – 284).

      Comment 15. Line 250: The product "AXY-Trek loggers" does not appear on the Technosmart website (there are similar names, but not an exact match). Are you sure this is the correct name of the tracking device you used? 

      Thank you for pointing out this detail that we missed. The device we used is now called "AXY-Trek Mini" (https://www.technosmart.eu/axy-trek-mini/). We have corrected this error directly in the revised manuscript.

      Comment 16. Line 256: Please explain how the devices were recovered. Did you recapture the animals? If so, how? Additionally, replace "after approximately 15 days" with the exact average and standard deviation. Furthermore, since you have these data, please state the difference in body mass between the two measurements before and after tagging.

      The birds were recaptured to recover the devices. Adult barn owls were recaptured at their nest sites, again using automatic sliding traps that are activated when birds enter the nest box. The statement "after approximately 15 days" was replaced by the exact mean and standard deviation, which were 10.47 ± 2.27 days. These numbers exclude five of the 163 individuals included in this study. Those five could not be recaptured in the appropriate time window but were re-encountered when they initiated a second clutch later in the season (4 individuals) or a new clutch the year after (1 individual).

      We integrated this previously missing information in the revised manuscript (lines 370 – 372).

      Comment 17. Line 259: What was the resolution of the camera? What were the recording methods and schedule? How did you analyze these data? 

      The resolution was set to 3.1 megapixels. Motion-sensitive camera traps were installed at the entrance to each nest box throughout the period when the barn owls were wearing data loggers, and each detected movement triggered a burst of three photos. The photos were not analyzed as such for this study, but were used to confirm each prey delivery that had previously been detected from the accelerometer data. We added these details to the revised manuscript (lines 377 – 380).

      Comment 18_1. Figure 1: 

      Panel A) Include the sex of the described individual. 

      The sex of the described individual is now included in the figure caption.

      Comment 18_2. It would be interesting to show these data for both males and females from the same nest box (choose another example if you don't have the data for this specific nest box). 

      Although we agree that showing tracks of males and females from the same nest is very interesting, the purpose of this figure was to illustrate our data annotation process, and we believe that adding too many details to this figure would make it appear cluttered. However, the revised manuscript now includes a new figure (Fig. 3A) which shows simultaneous GPS tracks of a male and a female during a complete night, with detailed information about perching and hunting behaviors.

      Comment 18_3. Add the symbol of the nest box to the legend. 

      Done

      Comment 18_4. Provide information about the total time of the foraging trip in the text below. 

      The duration of the illustrated foraging trip has been included in the figure caption.

      Comment 18_5. To enhance the figure’s information on foraging behavior, consider color coding the trajectory based on time and adding a background representing the landscape. Since this paper may be of interest to researchers unfamiliar with barn owl foraging behavior, it could answer some common questions. 

      For reasons similar to those explained in our answer above (Comment 18_2), we would rather keep this figure as clean as possible. However, we followed your recommendations and included these details in the new Figure 3 described above. In this new figure, GPS tracks are color coded according to the foraging trip number, and a background representing the landscape is included. To provide even more detail about the landscape, we added another figure in the supplementary materials (Fig. S2) which illustrates barn owl foraging grounds and nest sites and which we think might be of interest to people unfamiliar with barn owls.

      Comment 18_6. Inset panels) provide a detailed description of the acceleration insert panels. 

      Done

      Comment 18_7. Color code the acceleration data with different colors for each axis, add x and y axes with labels, and ensure the time frame on the x-axis is clear. How was the self-feeding behavior verified (should be described in the methods section)? 

      We kept both inset panels as simple as possible since they serve here as examples, but a complete representation of these behaviors (with time frame, different colors and labels) is provided in the supplementary materials (figure S3). We included this statement in the figure caption and added a reference to the full representations from the supplementary materials: 

      In the Figure caption: “Inset panels show an example of the pattern of the tri-axial acceleration corresponding to both nest-box return and self-feeding behaviors (but see Fig. S3 for a detailed representation of the acceleration pattern corresponding to each behavior).” 

      In the Method section: “Self-feeding was evident from multiple and regular acceleration peaks in the surge and heave axes (resulting in peaks in VeDBA values > 0.2 g and < 0.9 g, Fig. S3D), with each peak corresponding to the movement of the head as the prey was swallowed whole.”
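      For readers unfamiliar with VeDBA (vectorial dynamic body acceleration), the sketch below shows one common way to compute it: subtract a running mean (the static component) from each raw axis and take the vector norm of what remains. The 0.2 g and 0.9 g bounds come from the quoted Methods text; the smoothing window, the function names, and the edge handling are assumptions for illustration, not the authors' processing pipeline.

```python
import math


def running_mean(values, window):
    """Centered running mean; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def vedba(surge, sway, heave, window=5):
    """VeDBA per sample (in g) from three raw acceleration axes (in g).

    The window length is an assumed value, not one taken from the study.
    """
    dynamic = []
    for axis in (surge, sway, heave):
        static = running_mean(axis, window)
        dynamic.append([raw - s for raw, s in zip(axis, static)])
    return [
        math.sqrt(dx * dx + dy * dy + dz * dz)
        for dx, dy, dz in zip(*dynamic)
    ]


def in_self_feeding_band(values, low=0.2, high=0.9):
    """Flag samples whose VeDBA lies strictly between the two bounds (g)."""
    return [low < v < high for v in values]
```

      A motionless logger (constant acceleration on every axis) yields a VeDBA of zero under this definition, so only movement contributes to the peaks used for classification.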

      Comment 18_8. Panel B) Note in the caption that you refer to the acceleration z-axis.

      We believe that keeping the statement “the heave acceleration…” in the figure caption is more informative than referring to the “z-axis” as it describes the real dimension to which we are referring. The use of the x, y and z axes can be misleading as they can be interchanged depending on the type and setting of recorders used.

      Comment 18_9. Present the same time scale for both hunting strategies to facilitate comparison. You can achieve this by showing only part of the flight phase before perching. 

      Done

      Comment 18_10. Panel C) Presenting the data for both hunting strategy and sex would provide more comprehensive information about the results and would be relatively easy to implement. 

      We agree with your comment. We present the differences in landing force for both landing contexts and sexes in the new Figure 3 as well as in the supplementary materials (Figure S10) of this revised manuscript.

      Comment 19. Figure 2: Please provide an explanation of the meaning of the circles in the figure caption.  

      Done

      Comment 20. Figure 3: 

      Panel A) It is unclear how the owl illustration is relevant to this specific figure, unlike the previous figures where it is clear. Also, suggest removing the upper black line from the edge of the figure or add a line on the right side. 

      Done (now in Figure 2).

      Panel B) "Density" should be capitalized. 

      Done

      Panel C) Add a scale in meters, and it would be helpful to include an indication of time before hunting for each data point. 

      Done

      Comment 21. Figure S1: Mark the locations of the nest boxes and ensure that trajectories of different individuals and sexes can be identified. 

      The purpose of this figure was to show the spatial distribution of the data. We think that adding nest locations and coloring the paths according to individuals and/or sex will make the figure less clear. However, the new Figure 3 highlights those details.

      Comment 22. Figure S2: Show the pitch angle similarly to how you showed the acceleration axes, and explain what "VeDBA" stands for. Provide a description of the perching behavior, clearly indicating it on the figure. Add axes (x, y, z) to the illustration of the acceleration explanation. 

      We edited this figure (now figure S3) to show the pitch angle and provide an explanation of what “VeDBA” stands for in the figure caption. The figure caption now also provides a better description of the perching behavior. For the axes (i.e. X, Y, Z), we prefer to refer to the heave, surge, and sway as this is more informative and refers to what is usually reported in studies working with tri-axial accelerometers.

      Comment 23. Table S1: Improve the explanation in the caption and titles of the table. 

      Done

      Reviewer #2 (Recommendations For The Authors): 

      Comment 1. From the public review and my assessment there, the authors can be assured that I thoroughly enjoyed the read and am looking forward to seeing a revised and improved version of this paper. 

      We thank the reviewer for this comment. We revised the manuscript according to their comments.

      Comment 2. In addition to my major points stated above, I would like to add the following recommendations: 

      The manuscript is overall well written, but it uses a very pictorial language (a little as if we were in a David Attenborough documentary) that I find inappropriate for a research paper, especially in the abstract and introduction: "remarkable" (2x), "sophisticated" (are there any unsophisticated adaptations? We are referring to something under selection after all), etc.

      We appreciate that you found the paper well written overall, and we understand the comment about pictorial language. We therefore revised the text slightly so that the adjectives used to describe adaptive strategies are not over-emphasized.

      Comment 3. Abstract 

      "While the theoretical benefits of predator camouflage are well established, no study has yet been able to quantify its consequences for hunting success." - This claim is actually not fully true: 

      Nebel Carina, Sumasgutner Petra, Pajot Adrien and Amar Arjun 2019: Response time of an avian prey to a simulated hawk attack is slower in darker conditions, but is independent of hawk colour morph. R. Soc. open sci. 6:190677 

      We edited our claim to specify that the consequences of predator camouflage for hunting success have never been quantified in natural conditions, and we cited the reference in the introduction.

      Comment 4. Line 23. Rephrase to: "We used high-resolution movement data to quantify how barn owls (Tyto alba) conceal their approach when using a sit-and-wait strategy, as well as the power exerted during strikes." 

      We edited this sentence in the abstract, as suggested.

      Comment 5. Results 

      There is a disconnect between the objectives outlined at the end of the introduction and the following results that should be improved. 

      The authors state: "Using high-frequency GPS and accelerometer data from wild barn owls (Tyto alba), we quantify the landing dynamics of this sit-and-wait strategy to (i) examine how birds adjust their landing force with the behavioral and environmental context and (ii) test the extent to which the magnitude of the predator cue affects hunting success." But one of the first results presented are sex differences. 

      This is a fair point. We have now changed our statement at the end of the introduction as well as the order of the results to improve the link between the objectives outlined in the introduction and the way the results are presented. 

      Comment 6. At this stage, the reader does not even know yet that we are presented with a size-dimorphic species that also has very different parental roles during the breeding season. This should be better streamlined, with an extra paragraph in the introduction. And these sex differences are then not even discussed, so why bring them up in the first place (and not just state "sex has been fitted as additional co-variate to account for the size-dimorphism in the species" without further details). 

      We edited the way the objectives are outlined in the introduction to cover the size dimorphism (lines 70 – 76). We also completely changed the way the sex differences are presented in the results, including a new analysis that we believe provides a more comprehensive understanding of barn owl foraging behavior (lines 164 – 206). Finally, we added a new paragraph in the discussion to consider those results (lines 319 – 339).

      Comment 7. It is not clear to me where and how high-resolution GPS data were used? The results seem to concentrate on ACC – why GPS was used and how it features should be foreshadowed in a few lines in the introduction. I definitively prefer having the methods at the end of a manuscript, but with this structure, it is crucial to give the reader some help to understand the storyline. 

      GPS data were used to validate some behavioral classifications (prey provisioning, for example), but most importantly they were used to link each landing event with a perch type. We edited the text in the results section to clarify where GPS and/or ACC data were used.

      Comment 8. Discussion 

      Move the orca example further down, where more detail can be provided to understand the evidence. 

      After our extensive edits in the discussion, we felt this example was interrupting the flow. We now cite this study in the introduction. 

      Comment 9. Size dimorphism and evident sex differences are not discussed. 

      The revised manuscript now includes a new paragraph in the discussion in which sex differences are discussed (lines 319 – 339).

      Comment 10. Be more precise in the terminology used (for example, land use seems to be interchangeable with habitat characteristics?). 

      We replaced “land use” with “habitat data” in the revised manuscript.

      Comment 11. Methods 

      Please provide a justification for the very high weight limit (5%; line 256). This limit is outdated and does not fulfill the international standard of 3% body weight. I assume the ethics clearance went through because of the short nature of the study (i.e., the birds were not burdened for life with the excess weight? But a line is needed here or under the ethics considerations to clarify this). 

      The 5% weight limit was considered acceptable due to the short deployment period, and we have now edited the ethics statement to emphasize this point. However, it is important to note that there is no real international standard, with both 3% and 5% weight limits being commonly used. Both limits are arbitrary, and the impact of a fixed mass on a bird varies with species and flight style. All owls survived and bred similarly to the non-tagged individuals in the population (lines 373 – 376 & lines 558 – 561).

      EDITORIAL COMMENT: We strongly encourage you to provide further context and clarification on this issue, as suggested by the Reviewer. On a related point, the ethics statement refers to GPS loggers, rather than GPS and ACC devices; we encourage you to clarify wording here.

      Thank you for highlighting this point, which indeed needed some clarification.

      Although we used the terminology "GPS recorders", the authorization granted by the Swiss authorities for this study effectively covers the entire tracking system, which combines both GPS and ACC recorders in the same device. We have therefore changed the wording used in the ethics statement to avoid any misunderstanding (lines 373 – 376 & lines 558 – 561).

      Comment 12. Please provide more information on the model selection approach, what does "Non-significant terms were dropped via model simplification by comparing model AIC with and without terms." mean? Did the authors use a stepwise backward elimination procedure (drop1 function)? Or did they apply a complete comparison of several candidate models? I think a model comparison approach rather than stepwise selection would be more informative, as several rather than only one model could be equally probable. This might also improve model weights or might require a model averaging procedure - current reported R2 values are very small and do not seem to support the results well. 

      We apologize for the lack of details about this important aspect of the statistical analysis. We applied automated model selection using the dredge function from the R package “MuMIn”, which performs a complete comparison of the candidate models. In each analysis we retained the best-ranked model as the final model because the number of candidate models within ∆AIC<2 was relatively low, so model averaging was not appropriate here. We edited the methods section to ensure clarity, and added model selection tables for each analysis, ranked according to AICc scores, in the supplementary materials (lines 532 – 552).
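      To make the ∆AIC<2 criterion concrete, the sketch below ranks a set of candidate models by AICc, using the standard small-sample correction AICc = AIC + 2k(k+1)/(n − k − 1), and reports each model's difference from the best score together with its Akaike weight. It is an illustration only: the log-likelihoods, parameter counts, sample size, and model names are hypothetical, and the code is not the R workflow described in the response.

```python
import math


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and n observations."""
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def rank_models(candidates, n):
    """Rank candidate models by AICc.

    candidates maps a model name to (log_likelihood, number_of_parameters).
    Returns (name, aicc, delta, weight) tuples sorted from best to worst.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    relative = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(relative.values())
    rows = [(name, s, s - best, relative[name] / total) for name, s in scores.items()]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical candidate set, for illustration only.
candidates = {
    "full": (-100.0, 5),
    "reduced": (-100.5, 4),
    "null": (-110.0, 2),
}
ranked = rank_models(candidates, n=200)

# Models within 2 AICc units of the best one are treated as competitive.
competitive = [name for name, _, delta, _ in ranked if delta < 2]
```

      When only one or two models fall inside the threshold, keeping the top-ranked model is a common choice; a larger competitive set is the situation in which model averaging is usually considered.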

      In addition, we agree that the reported R-squared values in our analyses are quite low, specifically regarding the influence of pre-hunt perching force on hunting success (cond R2 = 0.04). Nonetheless, landing impact still has a notable effect size (an increase of 1N reduces hunting success by 15%). The reported values are indicative of the inherent complexity of studying hunting behavior in a wild setting, where numerous variables come into play. We specifically investigated the hypothesis that the force involved during pre-hunt landings, and consequently the emitted noise, influences the success of the next hunting attempt in wild barn owls. Factors such as prey behavior and micro-habitat characteristics surrounding prey (such as substrate type and vegetation height) are most likely to be influential but hard, or nearly impossible, to model. We now cover this in a more nuanced way in the discussion (lines 266 – 268).

      Comment 13. Please explain why BirdID was nested in NightID - this is not clear to me.

      There may be a misunderstanding here: we wrote that we nested NightID in BirdID (and not BirdID in NightID). 

      Comment 14. I hope the final graphs and legends will be larger, they are almost impossible to read. 

      We enlarged the graphs and legends as much as possible to improve readability. However, looking at the graphs in the published version they seem clear and readable.

      Comment 15. Figure S1: Does "representation" mean the tracks don't show all of the 163 owls? If so, be precise and tell us how many are illustrated in the figure. 

      Figure S1 represents the tracks for each of the 163 barn owls used in the study. We changed the terminology used in the figure caption to avoid any misunderstanding.

      Comment 16. Figure S4: Please adjust the y-axis to a readable format. 

      Done

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript aims at a quantitative model of how visual stimuli, given as time-dependent light intensity signals, are transduced into electrical currents in photoreceptors of macaque and mouse retina. Based on prior knowledge of the fundamental biophysical steps of the transduction cascade and a relatively small number of free parameters, the resulting model is found to fairly accurately capture measured photoreceptor currents under a range of diverse visual stimuli and with parameters that are (mostly) identical for photoreceptors of the same type.

      Furthermore, as the model is invertible, the authors show that it can be used to derive visual stimuli that result in a desired, predetermined photoreceptor response. As demonstrated with several examples, this can be used to probe how the dynamics of phototransduction affect downstream signals in retinal ganglion cells, for example, by manipulating the visual stimuli in such a way that photoreceptor signals are linear or have reduced or altered adaptation. This innovative approach had already previously been used by the same lab to probe the contribution of photoreceptor adaptation to differences between On and Off parasol cells (Yu et al, eLife 2022), but the present paper extends this by describing and testing the photoreceptor model more generally and in both macaque and mouse as well as for both rods and cones.
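      The idea of inverting a stimulus-to-response model can be illustrated with a deliberately simple toy: a saturating nonlinearity followed by first-order low-pass dynamics. This is not the phototransduction model from the manuscript; the model form, the constants DT, TAU, and S_HALF, and the function names are all assumptions chosen only to show how a desired response can be mapped back to the stimulus that produces it.

```python
import math

DT = 0.001     # time step in seconds (assumed)
TAU = 0.05     # response time constant in seconds (assumed)
S_HALF = 1.0   # half-saturating stimulus intensity (assumed)


def forward(stimulus, r0=0.0):
    """Toy model: saturating nonlinearity, then first-order low-pass dynamics."""
    response = [r0]
    for s in stimulus:
        drive = s / (s + S_HALF)
        r = response[-1]
        response.append(r + DT / TAU * (drive - r))
    return response


def invert(target):
    """Return the stimulus for which forward() reproduces the target response."""
    stimulus = []
    for r_now, r_next in zip(target, target[1:]):
        # Undo the low-pass step, then undo the saturating nonlinearity.
        drive = r_now + TAU / DT * (r_next - r_now)
        if not 0.0 <= drive < 1.0:
            raise ValueError("target is outside the range this toy model can produce")
        stimulus.append(S_HALF * drive / (1.0 - drive))
    return stimulus


# Desired response: a pure 2 Hz sinusoid around a mean of 0.5.
target = [0.5 + 0.2 * math.sin(2 * math.pi * 2.0 * i * DT) for i in range(1001)]
stimulus = invert(target)
reproduced = forward(stimulus, r0=target[0])
max_error = max(abs(a - b) for a, b in zip(target, reproduced))
```

      In this toy, the stimulus needed to obtain a sinusoidal response is itself not sinusoidal, and target responses that change too quickly or too strongly raise an error because no non-negative stimulus can produce them. That failure is only a loose analogy for the practical limits on inversion that the reviewers ask about.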

      Strengths:

      The presentation of the model is thorough and convincing, and the ability to capture responses to stimuli as different as white noise with varying mean intensity and flashes with a common set of model parameters across cells is impressive. Also, the suggested approach of applying the model to modify visual stimuli that effectively alter photoreceptor signal processing is thought-provoking and should be a powerful tool for future investigations of retinal circuit function. The examples of how this approach can be applied are convincing and corroborate, for example, previous findings that adaptation to ambient light in the primate retina, as measured by responses to light flashes, mostly originates in photoreceptors.

      Weaknesses:

      In the current form of the presentation, it doesn't become fully clear how easily the approach is applicable at different mean light levels and where exactly the limits for the model inversion are at high frequency. Also, accessibility and applicability by others could be strengthened by including more details about how parameters are fixed and what consensus values are selected.

      Thank you - indeed a central goal of writing this paper was to provide a tool that could be easily used by other laboratories. We have clarified and expanded four points in this regard: (1) we have stated more clearly that mean light levels are naturally part of the inversion process, and hence the approach can be applied across a broad range of light levels (lines 292-297); (2) we have expanded our analysis of the high frequency limits to the inversion and added that expanded figure to the main text (new Fig 5); (3) we have included additional detail about our calibration procedures, including our calibration code, to facilitate transfer to other labs; and (4) we have detailed the procedure for identification of consensus parameters (lines 172-182, 191-199 and Methods section starting on line 831).

      Reviewer #2 (Public Review):

      Summary:

      This manuscript proposes a modeling approach to capture nonlinear processes of photocurrents in mammalian (mouse, primate) rod and cone photoreceptors. The ultimate goal is to separate these nonlinearities at the level of photocurrent from subsequent nonlinear processing that occurs in retinal circuitry. The authors devised a strategy to generate stimuli that cancel the major nonlinearities in photocurrents. For example, modified stimuli would generate genuine sinusoidal modulation of the photocurrent, whereas a sinusoidal stimulus would not (i.e., because of asymmetries in the photocurrent to light vs. dark changes); and modified stimuli that could cancel the effects of light adaptation at the photocurrent level. Using these modified stimuli, one could record downstream neurons, knowing that any nonlinearities that emerge must happen post-photocurrent. This could be a useful method for separating nonlinear mechanisms across different stages of retinal processing, although there are some apparent limitations to the overall strategy.

      Strengths:

      (1) This is a very quantitative and thoughtful approach and addresses a long-standing problem in the field: determining the location of nonlinearities within a complex circuit, including asymmetric responses to different polarities of contrast, adaptation, etc.

      (2) The study presents data for two primary models of mammalian retina, mouse, and primate, and shows that the basic strategy works in each case.

      (3) Ideally, the present results would generalize to the work in other labs and possibly other sensory systems. How easy would this be? Would one lab have to be able to record both receptor and post-receptor neurons? Would in vitro recordings be useful for interpreting in vivo studies? It would be useful to comment on how well the current strategy could be generalized.

      We agree that generalization to work in other laboratories is important, and indeed that was a motivation for writing this as a methods paper. The key issue in such generalization is calibration. We have expanded our discussion of our calibration procedures and included that code as part of the github repository associated with the paper. Figure 10 (previously Figure 9) was added to illustrate generalization. We believe that the approach we introduce here should generalize to in vivo conditions. We have expanded the text on these issues in the Discussion (sections starting on line 689 and 757).

      Weaknesses:

      (1) The model is limited to describing photoreceptor responses at the level of photocurrents, as opposed to the output of the cell, which takes into account voltage-dependent mechanisms, horizontal cell feedback, etc., as the authors acknowledge. How would one distinguish nonlinearities that emerge at the level of post-photocurrent processing within the photoreceptor as opposed to downstream mechanisms? It would seem as if one is back to the earlier approach, recording at multiple levels of the circuit (e.g., Dunn et al., 2006, 2007).

      Indeed the current model is limited to a description of rod and cone photocurrents. Nonetheless, the transformation of light inputs to photocurrents can be strongly nonlinear, and such nonlinearities can be difficult to untangle from those occurring late in visual processing. Hence, we feel that the ability to capture and manipulate nonlinearities in the photocurrents is an important step. We have expanded Figure 10 to show an additional example of how manipulation of nonlinearities in phototransduction can give insight into downstream responses. We have also noted in text that an important next step would be to include inner segment mechanisms (section starting on line 661); doing so will require not only characterization of the current-to-voltage transformation, but also horizontal cell feedback and properties of the cone output synapse.

      (2) It would have been nice to see additional confirmations of the approach beyond what is presented in Figure 9. This is limited by the sample (n = 1 horizontal cell) and the number of conditions (1). It would have been interesting to at least see the same test at a dimmer light level, where the major adaptation mechanisms are supposed to occur beyond the photoreceptors (Dunn et al., 2007).

      We have added an additional experiment to this figure (now Figure 10) which we feel nicely exemplifies the approach. The approach that we introduce here really only makes sense at light levels where the photoreceptors are adapting; at lower light levels the photoreceptors respond near-linearly, so our “modified” and “original” stimuli as in Figure 10 (previously Figure 9) would be very similar (and post-phototransduction nonlinearities are naturally isolated at these light levels).

      Reviewer #3 (Public Review):

      Summary:

      The authors propose to invert a mechanistic model of phototransduction in mouse and primate photoreceptors to derive stimuli that compensate for nonlinearities in these cells. They fit the model to a large set of photoreceptor recordings and show in additional data that the compensation works. This can allow the exclusion of photoreceptors as a source of nonlinear computation in the retina, as desired to pinpoint nonlinearities in retinal computation. Overall, the recordings made by the authors are impressive and I appreciate the simplicity and elegance of the idea. The data support the authors' conclusions but the presentation can be improved.

      Strengths:

      -  The authors collected an impressive set of recordings from mouse and primate photoreceptors, which is very challenging to obtain.

      -  The authors propose to exploit mechanistic mathematical models of well-understood phototransduction to design light stimuli that compensate for nonlinearities.

      -  The authors demonstrate through additional experiments that their proposed approach works.

      Weaknesses:

      -  The authors use numerical optimization for fitting the parameters of the photoreceptor model to the data. Recently, the field of simulation-based inference has developed methods to do so, including quantification of the uncertainty of the resulting estimates. Since the authors state that two different procedures were used due to the different amounts of data collected from different cells, it may be worthwhile to rather test these methods, as implemented e.g. in the SBI toolbox (https://joss.theoj.org/papers/10.21105/joss.02505). This would also allow them to directly identify dependencies between parameters, and obtain associated uncertainty estimates. This would also make the discussion of how well constrained the parameters are by the data or how much they vary more principled because the SBI uncertainty estimates could be used.

      Thank you - we have improved how we describe and report parameter values in several ways. First, the previous text erroneously stated that we used different fitting procedures for different cell types - but the real difference was in the amount of data and range of stimuli we had available between rods and cones. The fitting procedure itself was the same for all cell types. We have clarified this along with other details of the model fitting both in the main text (lines 121-130) and in the Methods (section starting on line 832). We also collected parameter values and estimates of allowed ranges in two tables. Finally, we used sloppy modeling to identify parameters that could covary with relatively small impact on model performance; we added a description of this analysis to the Methods (section starting on line 903).

      -  In several places, the authors refer the reader to look up specific values e.g. of parameters in the associated MATLAB code. I don't think this is appropriate, important values/findings/facts should be in the paper (lines 142, 114, 168). I would even find the precise values that the authors measure interesting, so I think the authors should show them in a figure/table. In general, I would like to see also the average variance explained by different models summarized in a table and precise mean/median values for all important quantities (like the response amplitude ratios in Figures 6/9).

      We have added two tables with these parameter values and estimates of allowable ranges. We also added points to show the mean (and SD) across cells to the population figures and added those numerical values to the figure legends throughout.

      -  If the proposed model is supposed to model photoreceptor adaptation on a longer time scale, I fail to see why this can be an invertible model. Could the authors explain this better? I suspect that the model is mainly about nonlinearities as the authors also discuss in lines 360ff.

      For the stimuli that we use we see little or no contribution of slow adaptation in phototransduction. We have expanded the description of this point in the text and referred to Angueyra et al (2022) which looks at this issue in more detail for primate cones (paragraph starting on line 280).

      -  The important Figures 6-8 are very hard to read, as it is not easy to see what the stimulus is, the modified stimulus, the response with and without modification, what the desired output looks like, and what is measured for part B. Reworking these figures would be highly recommended.

      We have reworked all of the figures to make the traces clearer.

      -  If I understand Figure 6 correctly, part B is about quantifying the relative size of the response to the little first flash to the little second flash. While clearly, the response amplitude of the second flash is only 50% for the second flash compared to the first flash in primate rod and cones in the original condition, the modified stimulus seems to overcompensate and result in 130% response for the second flash. How do the authors explain this? A similar effect occurs in Figure 9, which the authors should also discuss.

      Indeed, in those instances the modified stimulus does appear to overcompensate. We suspect this is due to differences in sensitivity between the specific cells probed for these experiments and those used in the model construction. We now describe this limitation in more detail (lines 524-526). A similar point comes up for those experiments in which we speed the photoreceptor responses (new Figure 9B), and we similarly note that the cells used to test those manipulations differed systematically from those used to fit the model (lines 558-560).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I only have a few minor questions and suggestions for clarification.

      It hasn't become fully clear to me how general the model is when different mean light levels (on long-time scales) are considered. Are there slow adaptation processes not captured in the model that affect model performance? And how should one go about setting the mean light level when, for example, probing ganglion cells with a stimulus obtained through model inversion? Should it work to add an appropriate DC component to the current that is provided as input to the inverted model? (Presumably, deriving a stimulus and then just adding background illumination should not work, or could this be a good approximation, given a steady state that is adapted to the background?)

      We have clarified in the main text that slow adaptation does not contribute substantially to responses to the range of stimuli we explored (lines 281-289). We have also clarified that the stimulus in the model inversion is specified in isomerizations per second - so the mean value of the stimulus is automatically included in the model inversion (lines 293-298).

      Furthermore, a caveat for the model inversion seems to be the potential amplification of high-frequency noise. The suggested application of a cutoff temporal frequency seems appropriate, but data are shown only for a few example cells. Is this consistent across cells? (Given that performance between, e.g., mouse cones can vary considerably according to Fig. 4B?) I would also like to suggest moving the corresponding Supplemental Figure (4.1) into the main part of the manuscript, as it seems quite important.

      We have added population analysis to the new Figure 5 (which was Figure 4 - Figure Supplement 1). We have also clarified that the amplification of high frequency noise is an issue only when we try to apply model inversion to measured stimuli. When we use model inversion to identify stimuli that elicit desired responses, the target responses are computed from a linear model that has no noise, so this is not a concern in applications like those in Figures 6-10.

      Also, could the authors explain more clearly what the effect of the normalization of the estimated stimulus by the power of the true stimulus is? Does this simply reduce power at high frequency or also affect frequencies below the suggested cutoff (where the stimulus reconstruction should presumably be accurate even without normalization)?

      Indeed this normalization reduces high frequency power and has little impact on low frequencies where the inversion is accurate; this is now noted in the text (line 363). As for amplification of high frequency noise (previous comment), the normalization by the stimulus power is only needed when inverting measured responses (i.e. responses with noise) and is omitted when we are identifying stimuli that elicit desired responses (e.g. in Figures 6-10).
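
      The noise issue discussed in the two responses above can be illustrated with a minimal sketch. It uses a linear low-pass filter as a stand-in for the photoreceptor model and a hard frequency cutoff in place of the authors' power normalization, so the kernel, the stimulus, and all function names are assumptions for illustration only. Dividing a noisy response by the filter spectrum amplifies noise at frequencies where the filter gain is small; zeroing those frequencies removes the problem while leaving the low-frequency reconstruction intact.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (fine for short traces)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse DFT, returning the real part of the time-domain signal."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def invert_with_cutoff(response, kernel, cutoff_bin):
    """Estimate the stimulus by spectral division, zeroing bins above cutoff_bin."""
    n = len(response)
    R, H = dft(response), dft(kernel)
    S = [R[k] / H[k] if min(k, n - k) <= cutoff_bin else 0.0 for k in range(n)]
    return idft(S)

n = 64
raw = [math.exp(-t / 4.0) for t in range(n)]          # toy low-pass kernel
kernel = [v / sum(raw) for v in raw]
stimulus = [1.0 + 0.5 * math.sin(2 * math.pi * 2 * t / n)
            + 0.3 * math.cos(2 * math.pi * 5 * t / n) for t in range(n)]

# Noise-free response (circular convolution), then add high-frequency "noise"
clean = idft([s * h for s, h in zip(dft(stimulus), dft(kernel))])
noisy = [r + 0.01 * (-1) ** t for t, r in enumerate(clean)]

def max_error(estimate):
    return max(abs(e - s) for e, s in zip(estimate, stimulus))

err_full = max_error(invert_with_cutoff(noisy, kernel, n // 2))  # no cutoff
err_cut = max_error(invert_with_cutoff(noisy, kernel, 8))        # with cutoff
print("max error without cutoff:", err_full)
print("max error with cutoff:", err_cut)
```

      When the target response is computed from a noise-free model, as in the authors' stimulus-design applications, the division introduces no such amplification and the cutoff is unnecessary.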

      While the overall performance of the model to predict photoreceptor currents is impressive, it seems that particular misses occur for flashes right after a step in background illumination and for the white-noise responses at low background illumination (e.g. Figure 1B). Is that systematic, and if so what might be missing in the model?

      Indeed the model (at least with fixed parameters across stimuli) appears to systematically miss a few aspects of the photoreceptor responses. These include the latency of the response to a bright flash and the early flashes in the step + flash protocol in Figure 1B. Model errors for the variable mean noise stimulus (Figure 2) showed little dependence on time even when responses were sorted by mean light level and by previous mean level. Model errors did not show a clear systematic dependence on light level; this likely reflects, at least in part, the use of mean-square-error to identify model parameters. We have expanded our discussion of these systematic errors in the text (lines 164-166).

      I was also wondering whether this is related to the fact that in Figure 9B, the gain in the modified condition is actually systematically higher when there is more background light. Do the authors think that this could be a real effect or rather an overcompensation from the model? (By the way, is it specified what "Delta-gain" really is, i.e., ratio or normalized difference?)

      We suspect this is an issue with the sensitivity of the specific cells for which we did these experiments (i.e. variability in the gamma parameter between cells). This sensitivity varies between cells, and such variations are likely to place the strongest limitation on our ability to use this approach to manipulate responses in different retinas. We now note those issues in the Results (lines 523-526, 557-559 and 591-593) with reference to Figures 9 (previously Figure 8) and 10 (previously Figure 9), and describe this limitation more generally in the Discussion (section starting on line 649). We have also changed delta-gain to response ratio, which seemed more intuitive.

      Maybe I missed this, but it seems that the parameter gamma is fitted in a cell-type-specific fashion (e.g. line 163), but then needs to be fixed for held-out cells. How was this done? Is there much variability of gamma between cells?

      There is variability in gamma between cells, and this likely explains some of the systematic differences between data and model (see above and Methods, lines 902-903). For the consensus models in Figure 2B, gamma was allowed to vary for each cell while the remaining consensus model parameters were fixed. Gamma was set equal to the mean value across cells for model inversion (i.e. for all of the analyses in Figures 4-10). We have described the fitting procedure in considerably more detail in the revised Methods (starting on line 832).
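
      The logic of this procedure (one free sensitivity parameter per cell, all other parameters shared, and the mean sensitivity used afterwards) can be sketched as follows. The saturating toy model, the grid search, and every name in the code are assumptions made for illustration; the actual phototransduction model and optimizer are described in the authors' Methods.

```python
def toy_response(stimulus, gamma, half_saturation=1.0):
    """Toy saturating response; gamma scales sensitivity, half_saturation is shared."""
    return [gamma * s / (half_saturation + gamma * s) for s in stimulus]

def mean_squared_error(predicted, recorded):
    return sum((p - r) ** 2 for p, r in zip(predicted, recorded)) / len(recorded)

def fit_gamma(stimulus, recorded, candidates):
    """Grid search over gamma with all shared parameters held fixed."""
    return min(candidates,
               key=lambda g: mean_squared_error(toy_response(stimulus, g), recorded))

stimulus = [0.1, 0.5, 1.0, 2.0, 5.0]
true_gammas = [0.8, 1.0, 1.3]                       # hypothetical per-cell values
recordings = [toy_response(stimulus, g) for g in true_gammas]

candidates = [0.5 + 0.05 * i for i in range(31)]    # 0.5 to 2.0 in steps of 0.05
fitted = [fit_gamma(stimulus, rec, candidates) for rec in recordings]
mean_gamma = sum(fitted) / len(fitted)              # value used for held-out cells
print("per-cell gamma:", fitted)
print("mean gamma:", mean_gamma)
```

      Using the mean gamma for a new cell introduces an error proportional to how far that cell's sensitivity lies from the population mean, which is the limitation the authors describe.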

      For completeness, it would be nice to have the applied consensus model parameters in the manuscript rather than just in the Matlab code (especially since the code has not been part of the submission). Also, some notes on how the numerical integration of the differential equations was done would be nice (time step size?).

      We have added tables with consensus parameters and estimates of the sensitivity of model predictions to each parameter. We have also added additional details about the numerical approaches (including the time step) to Methods.

      Similarly, it would be nice to explicitly see the relationships that are used to fix certain model parameters (lines 705ff). And can the constants k and n (lines 709-710) be assumed identical for different species and receptor types?

      We have added more details on the model fitting to the Methods, including the use of steady-state conditions to hold certain parameters fixed (lines 862 and 866). We are not aware of any direct comparisons of k and n across species and receptor types. We have noted that model performance was not improved by modest changes in these parameters (due to compensation by other model parameters). More generally, we have explained how some parameters trade for others and hence the logic of fixing some even when exact values were not available.

      For the previous measurements of m and beta (lines 712-713), is there a reference or source?

      We have added references for these values.

      Did the authors check for differences in the model parameters between cone types (e.g., S vs. M)?

      We did not include S cones here. They are harder to record from and collecting a fairly large data set across a range of stimuli would be challenging. Our previous work shows that S cones have slower responses than L and M cones, and this would certainly be reflected in differences in model parameters. We have noted this in the text (Methods, lines 808-810).

      For the stated flash responses time-to-peak (lines 183-184), is this for a particular light intensity with no background illumination?

      Those are flashes from darkness - now noted in the text.

      Figure 2 - Supplement 1 doesn't have panel labels A and B, unlike the legend.

      Fixed - thank you.

      Reviewer #2 (Recommendations For The Authors):

      (1) Fig. 2B - for some cells, the consensus model seems to fit better than the individual model. How is this possible?

      This was mostly an error on our part (we inadvertently included responses to more stimuli in fitting the individual models, which slightly hampered their performance). Even with this correction, however, a few cells remain for which the consensus model outperforms an individual model. We believe this is because there is more data to constrain model parameters for the consensus models (since they are fit to all cells at the same time), and that can compensate for improvements associated with customizing parameters to specific cells.

      (2) Fig. 2 Supplement 1, it would be useful to see a blow-up of the data in an inset, as in Fig. 2B.

      Thanks - added.

      (3) Line 400 - this paragraph could include additional quantification and statistics to back up claims re 'substantially reduced', 'considerably lower'.

      We quantify that in the next sentence by computing the mean-square-error between responses and sinusoidal fits (also in Figure 7B, which now includes statistics as well). We have made that connection more direct in the text.

      (4) Maybe a supplement to Fig. 8 could show the changes to the stimulus required to alter the kinetics in both directions - to give more insight into part B., especially.

      Good suggestion - we have added the stimuli to all of the panels of the figure (now Figure 9).

      (5) Fig. 8B - in 'Speed response up' condition - there seems to be error in the model for the decay time of the response - especially for the 'original' condition, which is not quantified in 8C. Was it generally difficult to predict responses to flashes?

      That seems largely to reflect that the cells used for those experiments had faster initial kinetics than the average cells (responses to the control traces are also faster than model predictions in these cells - black traces in Figure 9B). We have added this to the text.

      (6) Line 678, possibly note that 405 nm equally activates S and M photopigments in mice, since most of the cones co-express the two photopigments (Rohlich et al., 1994; Applebury et al., 2000; Wang et al., 2011).

      Thanks - we have added this (lines 827-829).

      (7) The discussion could include a broader description of the various approaches to identifying nonlinearities within retinal circuitry, which include (incomplete list): recording at multiple levels of the circuit (e.g., Kim and Rieke 2001; Rieke, 2001; Baccus and Meister, 2002; Dunn et al., 2006; 2007; Beaudoin et al., 2007; Baccus et al., 2008); recording currents vs. spiking responses in a ganglion cell (e.g., Kim and Rieke, 2001; Zaghloul et al., 2005; Cui et al., 2016); neural network modeling approaches (e.g., Maheswaranathan et al., 2023); optogenetic approaches to studying filtering/nonlinear behavior at synapses (e.g., Pottackal et al., 2020; 2021).

      Good suggestion - we have added this to the final paragraph of the Discussion.

      Reviewer #3 (Recommendations For The Authors):

      -  I am personally not a fan of the style: "... as Figure 4A shows..." or comparable and much prefer a direct "We observe that X is the case (Figure 4A)". If the authors agree, they may want to revise their paper in this way.

      We have revised the text to avoid the “... as Figure xx shows” construction. We have retained multiple instances which follow a “Figure xx shows that …” construction (which is both active rather than passive and does not use a personal pronoun).

      -  I am not a fan of the title. Light-adaption clamp caters only to a very specialized audience.

      We have changed the title to “Predictably manipulating photoreceptor light responses to reveal their role in downstream visual responses.”

      -  The parameter fitting procedure should not only be described in Matlab code, but in the paper.

      Thanks - we have expanded this in the Methods considerably (section starting on line 832).

      -  The authors should elaborate on why different fitting procedures were used.

      We did not describe that issue clearly. The fitting procedures used across cells were identical, but we had different data available for different cell types due to experimental limitations. We have substantially revised that part of the main text to clarify this issue (paragraph starting on line 121).

      -  The authors state in line 126 that the input stimulus is supposed to mimic eye movements: mouse, monkey, or human? Please clarify.

      Thanks - we have changed this sentence to “abrupt and frequent changes in intensity that characterize natural vision.”

      -  Please improve the figure style. For example, labels should be in consistent capitalization and ideally use complete words (e.g. Figure 2B, 4B, and others).

      We have made numerous small changes in the figures to make them more consistent.

      -  Is the fraction of variance calculated on held-out-data? Linear models should be added to Figure 2B.

      The fraction of variance explained was not calculated on held out data because of limitations in the duration of our recordings. Given the small number of free parameters, and the ability of the model to capture held out cells, we believe that the model generalizes well. We have added a supplemental figure with linear model performance (Figure 2 - Figure Supplement 2).

      -  Fig. 9A is lacking bipolar cell and amacrine cell labels. Currently, it looks like HC is next to the BC in the schematic.

      Thanks - we have updated that figure (now Figure 10A).

      -  Maybe I am misunderstanding something, but it seems like the linear model prediction shown in Figure 2A for the rod could be easily improved by scaling it appropriately. Is this impression correct or why not?

      We have clarified how the linear model is constructed (by fitting the linear model to low contrast responses of the full model at the mean stimulus intensity). We also added a supplemental figure, following the suggestion above, showing the linear model performance when a free scaling factor is included for each cell.

      -  The verification experiment in Fig. 5 is only anecdotal and is elaborated only in Figure 6. If I am not mistaken, this does not necessitate its own figure/section but could rather be merged.

      We have kept this figure separate (now Figure 6) as we felt that it was important to highlight the approach in general in a figure before getting into quantification of how well it works.

      -  Figure 5 right is lacking labels. What is red and grey?

      Thanks for catching that - labels are added now.

      -  The end of the Discussion is slightly unusual. Did some text go missing?

      Thanks - we have rearranged the Discussion so as not to end on Limitations.

      -  There is a bonus figure at the end which seems also not to belong in the manuscript.

      Thanks - the bonus figure is removed now.

      -  The methods should also describe briefly what kind of routines were used in the Matlab code, e.g. gradient descent with what optimizer?

      We’ve added that information as well.

    1. Snippets

      Nice snippets, but they need to be converted to LuaSnip. Also, some of them (such as the code block) are better derived from markdown plugins such as mkdx. Some are even better implemented in non-snippet plugins (images, tables).

    1. Author response:

      Reviewer #1 (Public Review):

      Weaknesses:

      There are some minor weaknesses.

      Notably, there are not a lot of new insights coming from this paper. The structural comparisons between MCC and PCC have already been described in the literature and there were not a lot of significant changes (outside of the exo- to endo- transition) in the presence vs. absence of substrate analogues.

      We agree that the structures of the human MCC and PCC holoenzymes are similar to their bacterial homologs. That is due to the conserved sequences and functions of MCC and PCC across different species.

      There is not a great deal of depth of analysis in the discussion. For example, no new insights were gained with respect to the factors contributing to substrate selectivity (the factors contributing to selectivity for propionyl-CoA vs. acetyl-CoA in PCC). The authors state that the longer acyl group in propionyl-CoA may mediate stronger hydrophobic interactions that stabilize the alpha carbon of the acyl group at the proper position. This is not a particularly deep analysis and doesn't really require a cryo-EM structure to invoke. The authors did not take the opportunity to describe the specific interactions that may be responsible for the stronger hydrophobic interaction nor do they offer any plausible explanation for how these might account for an astounding difference in the selectivity for propionyl-CoA vs. acetyl-CoA. This suggests, perhaps, that these structures do not yet fully capture the proper conformational states.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      The authors also need to be careful with their over-interpretation of structure to invoke mechanisms of conformational change. A snapshot of the starting state (apo) and final state (ligand-bound) is insufficient to conclude *how* the enzyme transitioned between conformational states. I am constantly frustrated by structural reports in the biotin-dependent enzymes that invoke "induced conformational changes" with absolutely no experimental evidence to support such statements. Conformational changes that accompany ligand binding may occur through an induced conformational change or through conformational selection and structural snapshots of the starting point and the end point cannot offer any valid insight into which of these mechanisms is at play.

      Point accepted. We will revise our manuscript to use "conformational differences" instead of "conformational changes" to describe the differences between the apo and ligand-bound states.

      Reviewer #2 (Public Review):

      Comments and questions to the manuscripts:

      I'm quite impressed with the protein purification and structure determination, but I think some functional characterization of the purified proteins should be included in the manuscript. The activity of enzymes should be the foundation of all structures and other speculations based on structures.

      We appreciate this comment. However, since we purified the endogenous BDCs and the sample we obtained was a mixture of four BDCs, the enzymatic activity of this mixture cannot accurately reflect the catalytic activity of PCC or MCC holoenzyme. We will acknowledge this limitation in the discussion section of our revised manuscript.

      In Figure 1B, the structure of MCC is shown as two layers of beta units and two layers of alpha units, while there is only one layer of alpha units resolved in the density maps. I suggest the authors show the structures resolved based on the density maps and show the complete structure with the docked layer in the supplementary figure.

      We appreciate this comment. We have shown the cryo-EM maps of the PCC and MCC holoenzymes in fig. S8 to indicate the unresolved regions in these structures. The BC domains in one layer of MCCα in the MCC-apo structure were not resolved. However, we think it would be better to show a complete structure in Fig. 1 to provide an overall view of the MCC holoenzyme. We will revise Fig. 1B and the figure legend to clearly point out which domains were not resolved in the cryo-EM map and were built in the structure through docking.

      In the introduction, I suggest the author provide more information about the previous studies about the structure and reaction mechanisms of BDCs, what is the knowledge gap, and what problem you will resolve with a higher resolution structure. For example, you mentioned in line 52 that G437 and A438 are catalytic residues, are these residues reported as catalytic residues or this is based on your structures? Has the catalytic mechanism been reported before? Has the role of biotin in catalytic reactions revealed in previous studies?

      Point accepted. It was reported that G419 and A420 in S. coelicolor PCC, corresponding to G437 and A438 in human PCC, were the catalytic residues (PMID: 15518551). The same study also reported the catalytic mechanism of the carboxyl transfer reaction. The role of biotin in the BDC-catalyzed carboxylation reactions has been extensively studied (PMIDs: 22869039, 28683917). We will include this information in the introduction section of our revised manuscript.

      In the discussion, the authors indicate that the movement of biotin could be related to the recognition of acyl-CoA in BDCs, however, they didn't observe a change in the propionyl-CoA bound MCC structure, which is contradictory to their speculation. What could be the explanation for the exception in the MCC structure?

      We appreciate this comment. We do not have a good explanation for why we did not observe a change in the propionyl-CoA bound MCC structure. It is noteworthy that neither acetyl-CoA nor propionyl-CoA is the natural substrate of MCC. Recently, a cryo-EM structure of the human MCC holoenzyme in complex with its natural substrate, 3-methylcrotonyl-CoA, has been resolved (PDB code: 8J4Z). In this structure, the binding site of biotin and the conformation of the CT domain closely resemble those in our acetyl-CoA-bound MCC structure. Therefore, the movement of biotin induced by acetyl-CoA binding mimics that induced by the binding of MCC's natural substrate, 3-methylcrotonyl-CoA, indicating that in comparison with propionyl-CoA, acetyl-CoA is closer to 3-methylcrotonyl-CoA regarding its ability to bind to MCC. We will discuss this possibility in our revised manuscript.

      In the discussion, the authors indicate that the selectivity of PCC to different acyl-CoA is determined by the recognition of the acyl chain. However, there are no figures or descriptions about the recognition of the acyl chain by PCC and MCC. It will be more informative if they can show more details about substrate recognition in Figures 3 and 4.

      We appreciate this comment. Unfortunately, in the cryo-EM maps of the PCC holoenzymes, the acyl groups were not resolved (fig. S6), so we were unable to analyze the specific interactions between the acyl-CoAs and PCC. We will discuss this limitation in our revised manuscript.

      How do the solved structures compare with the latest AlphaFold3 prediction?

      Since AlphaFold3 was not released when our manuscript was submitted, we did not compare the solved structures with the AlphaFold3 predictions. We have now carried out the predictions using AlphaFold3. Due to the token limitation of the AlphaFold3 server, we can only include two α and six β subunits of human PCC or MCC in the prediction. The overall assembly patterns of the AlphaFold3-predicted structures are similar to those of the cryo-EM structures. The RMSDs between PCCα, PCCβ, MCCα, and MCCβ in the apo cryo-EM structures and those in the AlphaFold3-predicted structures are 7.490 Å, 0.857 Å, 7.869 Å, and 1.845 Å, respectively. The PCCα and MCCα subunits adopt an open conformation in the cryo-EM structures but adopt a closed conformation in the AlphaFold3-predicted structures, resulting in large RMSDs.
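
      For readers unfamiliar with the metric, the RMSD values quoted here summarize the average displacement between matched atoms in two structures. The sketch below shows only the final calculation and assumes the two coordinate sets are already superimposed and matched atom by atom; the response does not state which atoms or which superposition method were used, and the coordinates in the example are made up.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-aligned coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates in angstroms: the second set is shifted by 1 along x
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(x + 1.0, y, z) for x, y, z in experimental]
shift_rmsd = rmsd(experimental, predicted)
print("RMSD:", shift_rmsd)
```

      Because every deviation is squared before averaging, a domain that swings between open and closed positions can dominate the value even when the rest of the subunit superimposes well, which is consistent with the authors' explanation for the large α-subunit RMSDs.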

    1. Reviewer #2 (Public Review):

      Summary:

      This work addresses a puzzling finding in the viral forecasting literature: high-frequency viral variants evince signatures of neutral dynamics, despite strong evidence for adaptive antigenic evolution. The authors explicitly model interactions between the dynamics of viral adaptations and of the environment of host immune memory, making a solid theoretical and simulation-based case for the essential role of host-pathogen eco-evolutionary dynamics. While the work does not directly address improved data-driven viral forecasting, it makes a valuable conceptual contribution to the key dynamical ingredients (and perhaps intrinsic limitations) of such efforts.

      Strengths:

      This paper follows up on previous work from these authors and others concerning the problem of predicting future viral variant frequency from variant trajectory (or phylogenetic tree) data, and a model of evolving fitness. This is a problem of high impact: if such predictions are reliable, they empower vaccine design and immunization strategies. A key feature of this previous work is a "traveling fitness wave" picture, in which absolute fitnesses of genotypes degrade at a fixed rate due to an advancing external field, or "degradation of the environment". The authors have contributed to these modeling efforts, as well as to work that critically evaluates fitness prediction (references 11 and 12). A key point of that prior work was the finding that fitness metrics performed no better than a baseline neutral model estimate (Hamming distance to a consensus nucleotide sequence). Indeed, the apparent good performance of their well-adopted "local branching index" (LBI) was found to be an artifact of its tendency to function as a proxy for the neutral predictor. A commendable strength of this line of work is the scrutiny and critique the authors apply to their own previous projects. The current manuscript follows with a theory and simulation treatment of model elaborations that may explain previous difficulties, as well as point to the intrinsic hardness of the viral forecasting inference problem.

      This work abandons the mathematical expedience of traveling fitness waves in favor of explicitly coupled eco-evolutionary dynamics. The authors develop a multi-compartment susceptible/infected model of the host population, with variant cross-immunity parameters, immune waning, and infectious contact among compartments, alongside the viral growth dynamics. Studying the invasion of adaptive variants in this setting, they discover dynamics that differ qualitatively from the fitness wave setting: instead of a succession of adaptive fixations, invading variants have a characteristic "expiring fitness": as the immune memories of the host population reconfigure in response to an adaptive variant, the fitness advantage transitions to quasi-neutral behavior. Although their minimal model is not designed for inference, the authors have shown how an elaboration of host immunity dynamics can reproduce a transition to neutral dynamics. This is a valuable contribution that clarifies previously puzzling findings and may facilitate future elaborations for fitness inference methods.

      The authors provide open access to their modeling and simulation code, facilitating future applications of their ideas or critiques of their conclusions.

      Weaknesses:

      The current modeling work does not make direct contact with data. I was hoping to see a more direct application of the model to a data-driven prediction problem. In the end, although the results are compelling as is, this disconnect leaves me wondering if the proposed model captures the phenomena in detail, beyond the qualitative phenomenology of expiring fitness. I would imagine that some data is available about cross-immunity between strains of influenza and SARS-CoV-2, so hopefully some validation of these mechanisms would be possible.

      After developing the SIR model, the authors introduce an effective "expiring fitness" model that avoids the oscillatory behavior of the SIR model. I hoped this could be motivated more directly, perhaps as a limit of the SIR model with many immune groups. As is, the expiring fitness model seems to lose the eco-evolutionary interpretability of the SIR model, retreating to a more phenomenological approach. In particular, it's not clear how the fitness decay parameter nu and the initial fitness advantage s_0 relate to the key ecological parameters: the strain cross-immunity and immune group interaction matrices.
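
      To make the role of the two parameters concrete, here is a minimal sketch of one possible reading of the expiring fitness picture. It assumes, without confirmation from the manuscript, that the fitness advantage decays exponentially, s(t) = s_0 exp(-nu t), and that the variant frequency x follows deterministic logistic dynamics, dx/dt = s(t) x (1 - x). Under those assumptions the log-odds of x rise by at most s_0/nu, so the variant plateaus at an intermediate frequency instead of fixing.

```python
import math

def simulate_frequency(x0, s0, nu, t_end, dt=0.01):
    """Euler integration of dx/dt = s0 * exp(-nu * t) * x * (1 - x)."""
    x = x0
    for i in range(int(round(t_end / dt))):
        s = s0 * math.exp(-nu * i * dt)   # fitness advantage "expires" over time
        x += dt * s * x * (1.0 - x)
    return x

def closed_form_frequency(x0, s0, nu, t):
    """Exact solution: logit x(t) = logit x0 + (s0 / nu) * (1 - exp(-nu * t))."""
    logit = math.log(x0 / (1.0 - x0)) + (s0 / nu) * (1.0 - math.exp(-nu * t))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameter values, chosen only for illustration
x0, s0, nu = 0.01, 0.5, 0.25
numerical = simulate_frequency(x0, s0, nu, t_end=40.0)
exact = closed_form_frequency(x0, s0, nu, 40.0)
print("numerical:", numerical, "closed form:", exact)
```

      In this simplified picture only the ratio s_0/nu sets the final frequency, which makes the question raised above sharper: how that ratio maps onto the cross-immunity and immune group interaction matrices of the SIR model is what remains unspecified.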

    1. Editors Assessment:

      This paper presents a new tool to make using PhysiCell easier, which is an open-source, physics-based multicellular simulation framework with a very wide user base. PhysiCell Studio is a graphical tool that makes it easier to build, run, and visualize PhysiCell models. Over time, it has evolved from being a GUI to include many additional functionalities, and can be used as desktop and cloud versions. This paper outlines the many features and functions, the design and development process behind it, and deployment instructions. Peer review improved the organisation of the various repositories and led to the addition of both requirements.txt and environment.yml files. Looking to the future, the developers are planning to add new features based on community feedback and contributions, and this paper presents the many code repositories if readers wish to contribute to the development process.

      This evaluation refers to version 1 of the preprint

    2. Abstract: Defining a multicellular model can be challenging. There may be hundreds of parameters that specify the attributes and behaviors of objects. Hopefully the model will be defined using some format specification, e.g., a markup language, that will provide easy model sharing (and a minimal step toward reproducibility). PhysiCell is an open source, physics-based multicellular simulation framework with an active and growing user community. It uses XML to define a model and, traditionally, users needed to manually edit the XML to modify the model. PhysiCell Studio is a tool to make this task easier. It provides a graphical user interface that allows editing the XML model definition, including the creation and deletion of fundamental objects, e.g., cell types and substrates in the microenvironment. It also lets users build their model by defining initial conditions and biological rules, run simulations, and view results interactively. PhysiCell Studio has evolved over multiple workshops and academic courses in recent years which has led to many improvements. Its design and development has benefited from an active undergraduate and graduate research program. Like PhysiCell, the Studio is open source software and contributions from the community are encouraged.

      This work has been published in GigaByte Journal under a CC-BY 4.0 license (https://doi.org/10.46471/gigabyte.128), and has published the reviews under the same license. This is part of the PhysiCell Ecosystem Series: https://doi.org/10.46471/GIGABYTE_SERIES_0003

      Reviewer 1. Meghna Verma:

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined?

      The authors have provided links for video descriptions for installation and that is appreciated.

      One overall recommendation: combining all the screenshots (e.g., Figs 1-12 of the main paper and all the subsections in the Supplementary) into one figure would give a more complete overview and improve the overall flow of the paper.

      Additional comments are available here: https://gigabyte-review.rivervalleytechnologies.comdownload-api-file?ZmlsZV9wYXRoPXVwbG9hZHMvZ3gvVFIvNTA3L1Jldmlld19QaHlzaUNlbGxTdHVkaW9fTVYucGRm

      Reviewer 2. Koert Schreurs and Lin Wouters supervised by Inge Wortel

      Is there a clear statement of need explaining what problems the software is designed to solve and who the target audience is?

      The problem statement is addressed in the introduction, which mentions the need for a GUI tool as a much more accessible way to edit the XML-based model syntax. However, it is somewhat confusing who exactly the intended audience of the paper is. Is the paper targeted at researchers that already use PhysiCell, but might want to switch to the GUI version? Or should it (also) target the potential new user-base of researchers interested in using ABMs, for whom the XML version was not sufficiently accessible and who will now gain access to these models because there is a GUI? Specifying the intended audience might impact some sections of the paper. For example, for users who already use PhysiCell, the step-by-step tutorials might not be useful since they would already know most of the available options; they would just need a quick overview of what info is in which tab. But if the paper is (also) targeted at potential new users, then some additional information could make both the paper and the tool much more accessible, such as:
      
      • A clear comparison to other modeling frameworks and their functionalities. Why should they use PhysiCell instead of one of the other available (GUI) tools? For example, the referenced Morpheus, CC3D and Artistoo all focus on a different model framework (CPMs); this might be worth mentioning. And what about Chaste? Does it represent different types of models, or are there other reasons to consider PhysiCell over Chaste or vice versa? For new users, this would be important information to include. The paper currently also does not mention other frameworks except those that offer a GUI. While the main point of the paper is the addition of the GUI, for completeness sake it might still be good to mention a broader overview of ABM frameworks and how they compare to PhysiCell, or simply to refer to an existing paper that provides such an overview.
      • The current tutorial immediately dives into very specific instructions (what to click and exact values to enter), often without explaining what these options mean or do. New users would probably appreciate to get a rough outline of which types of processes can be modelled, and which steps they would take to do so. This could be as easy as summarising the different main tabs before going into the details. I understand that some of these explanations will overlap with the main PhysiCell software – but considering that the GUI will open up modelling to a different type of community, it might make sense to outline them here to get a self-contained overview of functionality.
      • Indeed, if the above information is provided, the detailed tutorial might fit better as an appendix or in online documentation. That would also leave more space to explain not only which values to enter, but also what these variables do, why choose these values, what other options to consider, etc. Having this information together in one place would be very useful for beginning users.

      Is the source code available, and has an appropriate Open Source Initiative license been assigned to the code?

      The software is available under the GPL v3 licence.

      As Open Source Software are there guidelines on how to contribute, report issues or seek support on the code?

      There is a Github repository, ensuring that it is possible to contribute and report issues, and the paper explicitly invites community contributions. However, although the paper mentions that it is possible to seek support through Github Issues and “Slack channels”, we could find no link to the latter resource. This should probably be added to make this resource usable for the reader (or otherwise the statement should be removed)

      Is installation/deployment sufficiently outlined in the paper and documentation, and does it proceed as outlined?

      Mostly yes, as installation and deployment are outlined in the paper and documentation. However, we did notice a couple of issues:

      • The studio guide explains how to compile a project in PhysiCell (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md), but does not mention that Mac users need to specify the g++ version at the top of the Makefile. This is explained in a separate blog (http://www.mathcancer.org/blog/setting-up-gcc-openmp-on-osx-homebrew-edition/) but should be outlined (or at least referenced) here as well.
      • There are several different resources covering the installation process, referring to e.g. github.com/physicell-training, github.com/PhysiCell-Tools/Studio-Guide, and the abovementioned blog. This might not be very accessible to all users targeted by the new GUI functionality (especially when command line interventions and manual Makefile edits are involved). While not all of this has to be changed before publication, having all information in one place would already improve accessibility to a larger user-base.
      • When following the instructions (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md) and running “python studio/bin/studio.py -p -e virus-sample”, the -p flag gives an error: “Invalid argument(s): [‘-p’]”. We assumed it has to be left out, but perhaps the docs have to be updated.

      Is the documentation provided clear and user friendly?

      Mostly yes, as there is already a lot of documentation available. However, the user-friendliness could be improved with some minor changes. For example, the documentation could be made more user-friendly if resources were available from a central spot. Currently, information can be found in different places:

      • https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md provides installation instructions and a nice overview of what is where in the GUI, but as mentioned above, does not mention potential issues when installing on MacOS.
      • The paper provides very detailed examples; these might be nice to include along with the abovementioned overview.
      • Potentially other places as well.

      It would be great if the main documentation page could at least link to these other resources with a brief description of what the user will find there. Further, some additions would make the documentation more complete:

      • It would be good to have an overview somewhere of all the configuration files that can be supplied/loaded (e.g. those for “rules” and for initial configurations).
      • A clearer instruction/small tutorial on how to use simularium and paraview with physicell studio; especially for paraview there is no instruction on how to use your own data or make your own `.pvsm` file.

      In the longer term, it might be worthwhile to set up a self-contained documentation website (this is relatively easy nowadays using e.g. Github pages), which can outline dependencies, installation instructions, a quick overview, detailed tutorials, example models, links to Github issues/slack communities. This is not a requirement for publication but might be worth looking into in the future as it would be more user-friendly.
      

      Is there a clearly-stated list of dependencies, and is the core functionality of the software documented to a satisfactory level?

      No. The core functionality of the software is nicely outlined in the Github README (https://github.com/PhysiCell-Tools/Studio-Guide/blob/main/README.md), but as mentioned before, this high-level overview is missing in the paper itself. The README and paper recommend installing the Anaconda python distribution to get the required python dependencies. This is fine, but adding a setup file or requirements.txt might still be useful for users who are more familiar with python and want a more minimal installation. Providing a conda environment.yml that allows running the studio along with paraview and/or simularium might also be helpful. Note that running the studio with simularium in anaconda did not work because anaconda did not have the required vtk v9.3.0; instead we had to install simularium without anaconda (“pip3 install simularium”).

      Are there (ideally real world) examples demonstrating use of the software?

      The detailed tutorial nicely walks the reader through the tool (although as mentioned before, a high-level overview is missing and the level of detail feels slightly out of place in the paper itself). When walking through the example in the paper and the supplementary, we did run into a few (minor) issues:

      • It might be good to stress explicitly that after copying the template.xml into tumor_demo.xml, the first step is always to compile using “make”. The paper mentions “Assuming … you have compiled the template project executable (called “project”) …”. But it might not be immediately clear to all users how exactly they should do so (presumably by running “make tumor_demo” after copying the xml file?).
      • When running “python studio/bin/studio.py -c tumor_demo.xml -e project” as instructed, a warning pops up that “rules0.csv” is not valid (although the tool itself still works).
      • The instructions for plotting say to press “enter” when changing cmin and cmax, but Mac offers only a return key. Pressing fn+return to get the enter functionality also does not work; it might be good to offer an alternative for Mac.
      • When reproducing the supplementary tutorial, results were slightly different. It might be good if the example offered a random seed so that users can verify that they can reproduce these results exactly. In our hands, reproducing Figs 39, 40, 48, and 49 yields far more (red) macrophages (even when running multiple times), but we could not be sure if this is due to variation between runs, or a mistake in the settings somewhere.
      
      
      The paper mentions that they have started setting up automated testing, but it does not give an idea of what the current test coverage is. Did they add a few tests here and there, or start to systematically test all parts of the software? I understand the latter might not be achievable immediately, but it would be good if users and/or contributors can at least get a sense of how good the current coverage is. (Note: the framework uses pytest, which seems to offer some functionality to generate coverage reports, see e.g. https://www.lambdatest.com/blog/pytest-code-coverage-report/). The code in studio_for_pytest.py has a comment “do later, otherwise problems sometimes”, but it is not entirely clear if the relevant issue has been resolved.
      

      Additional Comments: The presented tool offers a GUI for the PhysiCell framework for agent-based modeling. As outlined in the paper, this offers significant value to users, since editing a model is now much more accessible. The tool comes with extensive functionality and instructions. Overall, the tool functions as advertised, and will be of great value to the community of PhysiCell users who currently have to edit XML files by hand. It is therefore (mostly) publishable as is, provided some of the issues with installation (mentioned above) can be straightened out. That said, we do think some improvements could make both the tool and the paper more accessible to a larger user audience. Most of these have been mentioned in the other questions, but we will list some additional ones below. Note that many of these are just suggestions, so we will leave it up to the authors if and when they implement them.

      Suggestions for the paper: While the paper nicely outlines design ideas and usage of the tool, there were some points where we felt that the main point did not quite come across, for example:

      • As mentioned in the question about problem statement and intended audience, adding some information to the paper would make it a more useful resource to users not yet familiar with PhysiCell (see remarks there).
      • The section “Design and development” describes the development history of the tool. In principle this is a valuable addition, because it illustrates how the project is under ongoing development and has already been improved several times based on feedback of users. However, the amount of information on each previous stage is slightly confusing; it is not entirely clear how this relates to the paper and current tool. If the main point is to showcase that the current tool has been built based on practical user experiences, this would probably come across better if this section was somewhat shorter and focused on the design choices rather than previous versions. If the main point is something else, it should be clarified what the main idea is.
      • The point of Table 1 was unclear to us; consider removing it or explaining the main idea.
      • Several figures do not have captions (e.g. Figure 1 but also others); it would be helpful to clarify what message the figure should convey.
      • P4 “adjust the syntax for Windows if necessary”: is it self-explanatory how users should adjust? Consider adding the correct code for Windows as well if possible, since users that want to use the GUI tool might not be familiar with command line syntax.
      • P6 “if you create your own custom C++ code referring directly to cell type ID”: this functionality is never discussed. This might be part of the general PhysiCell functionality, but it would be good to at least provide a link to a resource on how you could do this.
      • P8 “Only those parameters that display … editing the C++ code”: it was not entirely clear to me what this means; could you clarify?
      • P13 mentions you can immediately see changes to the model parameters made. This is very useful for prototyping when users want immediate feedback. However, what happens when you try to save output for a simulation where parameters were changed while the simulation was running? Would users be reminded that their current output is not representative?
      • Discussion: it is good to mention that the tool is already being used. Can you give an indication based on your experience how long it takes new users to learn to navigate the tool? This might be useful information to add in the paper.
      • The last statement on LLMs seems to come out of nowhere. Consider leaving it out or expanding further on what would be needed to make this work/how feasible this is.

      Further comments on the tool itself:

      • The paper mentions that results may not be fully reproducible if multiple threads are used (I assume this is the case even when a random seed is set). In this case, would it make sense to throw a warning the first time a user tries to set a seed with multiple threads, to avoid confusion as to why the results are not reproducible?
      • Unusable fields are not always greyed out to indicate that they are disabled, which sometimes makes it seem as though the tool is unresponsive. In other places unusable options are set to grey, so it might be good to double-check if this is consistent.
      • At the initial conditions (IC) page there is no legend; it might be good to add one.
      • There are some small inconsistencies between the field names mentioned in the paper and those in the tool/screenshots. For example “boundary condition” (p5) should be “dirichlet BC”, “uptake” (p6) should be “uptake rate”. For the latter, the paper mentions that the length scale is 100 micron but this should be visible in the tool as well.
      • Not all fields have labels, so it is not always clear what the options do (see e.g. drop-downs in Figure 6).
      • There are a few points in the tool where you have to “enable” a functionality before it works, but this might not always be intuitive. For example, if you upload a file with initial conditions, it can be assumed that you want to use it. There might be good reasons for this in some cases but in general, consider if all these checkpoints are necessary or if this could be simplified. The same goes for the csv files that have to be saved separately instead of through the main “save” button: in the long term it might be worth saving all relevant files when they are updated, or at least throwing a warning that you have to save some of them separately.

    1. このようなことは通常、プロジェクトの初期の段階では望ましい動作を整理している最中のため発生します。またコードはまだ他の人に使用されていません。

      I found this a little hard to follow.

      Original

      This usually happens in the early stages of a project when desired behavior is still being sorted out, and no one is using your code yet.

      Suggested alternative

      このようなことは通常、プロジェクトが初期の段階で望ましい動作を整理している最中であり、かつ誰もまだ使い始めていないときに発生します。

    1. Response()

       Signature: Response(data, status=None, template_name=None, headers=None, content_type=None)

       Unlike regular HttpResponse objects, you do not instantiate Response objects with rendered content. Instead you pass in unrendered data, which may consist of any Python primitives.

       The renderers used by the Response class cannot natively handle complex datatypes such as Django model instances, so you need to serialize the data into primitive datatypes before creating the Response object.

       You can use REST framework's Serializer classes to perform this data serialization, or use your own custom serialization.

       Arguments:

       • data: The serialized data for the response.
       • status: A status code for the response. Defaults to 200. See also status codes.
       • template_name: A template name to use if HTMLRenderer is selected.
       • headers: A dictionary of HTTP headers to use in the response.
       • content_type: The content type of the response. Typically, this will be set automatically by the renderer as determined by content negotiation, but there may be some cases where you need to specify the content type explicitly.

      A simplified explanation of the Response() class in Django REST framework, with notes:

      Response Class in Django REST Framework

      • Purpose: The Response() class in Django REST framework is used to send data back to clients in various formats, such as JSON or HTML, based on what the client requests.

      • Usage: Unlike regular HttpResponse objects in Django, you don't give Response() class rendered content directly. Instead, you provide it with unrendered data, typically Python data types like lists or dictionaries.

      • Serialization: The Response() class cannot handle complex data types directly, like Django model instances. You need to convert these complex types into simpler data types (serialization) before passing them to Response().

      • Data Serialization: Use Django REST framework's Serializer classes to convert complex data (like Django models) into Python primitives (like dictionaries). This prepares the data for the Response() object to handle.

      • Arguments:

      • data: Serialized data (Python primitives) that will be sent in the response.
      • status: HTTP status code for the response (defaults to 200 for OK). It tells the client whether the request was successful or had an error.
      • template_name: Optional HTML template name to use if rendering HTML responses.
      • headers: Additional HTTP headers to include in the response.
      • content_type: The type of content in the response (usually set automatically based on content negotiation).
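The serialization point in the bullets above can be illustrated without Django at all. The following is a minimal sketch using only the standard library; the `Book` dataclass and `serialize_books` helper are hypothetical stand-ins for a model and a serializer, not REST framework APIs. It shows why an object must be reduced to primitives (dicts, lists, strings) before a renderer can turn it into JSON.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Book:
    """Hypothetical stand-in for a Django model instance."""
    title: str
    author: str


def serialize_books(books):
    """Convert Book objects into plain dicts (Python primitives)."""
    return [asdict(book) for book in books]


books = [Book("Book 1", "Author A"), Book("Book 2", "Author B")]

# A JSON encoder cannot handle the complex objects directly...
try:
    json.dumps(books)
    direct_encoding_failed = False
except TypeError:
    direct_encoding_failed = True

# ...but it can handle the serialized primitives.
data = serialize_books(books)
payload = json.dumps(data)
```

In a real view, `data` is the kind of value you would pass as the first argument to `Response()`; REST framework's Serializer classes play the role of `serialize_books` here.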

      Examples

      1. Sending JSON Data:

      ```python
      from rest_framework.decorators import api_view
      from rest_framework.response import Response


      @api_view(['GET'])
      def get_books(request):
          # Plain Python primitives; the selected renderer turns them into JSON.
          books = [
              {'title': 'Book 1', 'author': 'Author A'},
              {'title': 'Book 2', 'author': 'Author B'},
          ]
          return Response(books)
      ```

      Here, `Response()` is used to send a list of books as JSON data.

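      2. Handling Error Responses:

      ```python
      from rest_framework import status
      from rest_framework.decorators import api_view
      from rest_framework.response import Response


      @api_view(['POST'])
      def create_book(request):
          # Placeholder for the real creation logic (e.g. validating and
          # saving a serializer); here a book "succeeds" if it has a title.
          book_created_successfully = bool(request.data.get('title'))
          if book_created_successfully:
              return Response({'message': 'Book created successfully'}, status=status.HTTP_201_CREATED)
          return Response({'error': 'Failed to create book'}, status=status.HTTP_400_BAD_REQUEST)
      ```

      In this example, `Response()` is used to send messages about the success or failure of creating a book, along with appropriate HTTP status codes. The view is wrapped in `@api_view` so that REST framework can perform content negotiation and render the response.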

      Notes

      • Flexibility: Response() allows your Django API to respond with data in different formats based on client needs.
      • Serialization: Use serializers to convert complex data into formats Response() can handle.
      • HTTP Status Codes: Always consider setting appropriate HTTP status codes to inform clients about the success or failure of their requests.
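As a small aside on the status-code note, the named constants are just integers, and the class of a code is determined by its range. The sketch below uses Python's standard `http.HTTPStatus` rather than REST framework's `status` module, and the `describe` helper is a hypothetical name introduced for illustration.

```python
from http import HTTPStatus


def describe(status_code):
    """Classify an HTTP status code by its numeric range."""
    if 200 <= status_code < 300:
        return "success"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "other"


created = int(HTTPStatus.CREATED)          # 201
bad_request = int(HTTPStatus.BAD_REQUEST)  # 400

print(created, describe(created))
print(bad_request, describe(bad_request))
```

REST framework's `status.HTTP_201_CREATED` and `status.HTTP_400_BAD_REQUEST` are the same integers under more explicit names, which is why they can be passed directly as the `status` argument.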

      Using Response() in Django REST framework simplifies handling API responses, ensuring data is sent back to clients in the right format with proper status information.

    2. — Django documentation REST framework supports HTTP content negotiation by providing a Response class which allows you to return content that can be rendered into multiple content types, depending on the client request. The Response class subclasses Django's SimpleTemplateResponse. Response objects are initialised with data, which should consist of native Python primitives. REST framework then uses standard HTTP content negotiation to determine how it should render the final response content. There's no requirement for you to use the Response class, you can also return regular HttpResponse or StreamingHttpResponse objects from your views if required. Using the Response class simply provides a nicer interface for returning content-negotiated Web API responses, that can be rendered to multiple formats. Unless you want to heavily customize REST framework for some reason, you should always use an APIView class or @api_view function for views that return Response objects. Doing so ensures that the view can perform content negotiation and select the appropriate renderer for the response, before it is returned from the view.

      A breakdown of what this means:

      HTTP Content Negotiation: This is the process where a server and a client agree on the format of data that will be exchanged in an HTTP request. It's like deciding on the language in which two people will communicate.

      Response Class: In REST framework (used with Django), the Response class helps you send data back to clients in different formats (like JSON, HTML, etc.) based on what the client requests.

      Example: Imagine you have an endpoint /api/books/ that lists books. When a client (like a web browser or mobile app) asks for this list, they might want the data in JSON format (which is common for APIs), while another client might prefer HTML (for displaying in a web browser).

      Using Response Class: Instead of manually crafting the response each time, you can use the Response class provided by REST framework. It makes it easier to handle different formats. For instance, if a client requests JSON, the Response class can automatically convert your Python data (like lists of books) into JSON format.
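To make the idea of content negotiation concrete, here is a deliberately simplified sketch in plain Python. It is not REST framework's negotiation algorithm; `RENDERERS`, `parse_accept`, and `negotiate` are hypothetical names, and the sketch only handles exact media types, `*/*`, and `q` values. It shows the core idea: the client's `Accept` header decides which renderer turns the same data into the response body.

```python
import json
from html import escape


def render_json(data):
    return json.dumps(data)


def render_html(data):
    items = "".join(f"<li>{escape(book['title'])}</li>" for book in data)
    return f"<ul>{items}</ul>"


# Available renderers, in order of server preference.
RENDERERS = {"application/json": render_json, "text/html": render_html}


def parse_accept(header):
    """Return the media types in an Accept header, highest q-value first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        entries.append((media_type, quality))
    # sorted() is stable, so equal q-values keep the client's order.
    ranked = sorted(entries, key=lambda entry: -entry[1])
    return [media_type for media_type, quality in ranked if quality > 0]


def negotiate(accept_header, renderers=RENDERERS):
    """Pick (media_type, renderer) for the request, or None if no match."""
    for media_type in parse_accept(accept_header or "*/*"):
        if media_type == "*/*":
            first = next(iter(renderers))
            return first, renderers[first]
        if media_type in renderers:
            return media_type, renderers[media_type]
    return None


books = [{"title": "Book 1"}, {"title": "Book 2"}]
media_type, renderer = negotiate("text/html,application/json;q=0.9")
body = renderer(books)
```

With the `Response` class, a view returns only the `books` data; the framework performs the equivalent of `negotiate` and `renderer(books)` on its behalf.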

      Why Use It: By using the Response class, you ensure that your API can easily respond with data in the format that the client prefers, whether it's JSON, HTML, or another format. It simplifies your code and makes your API more flexible for different clients.

      When to Use It: Unless you have specific reasons not to, it's recommended to use the Response class in your views that handle API requests. This way, REST framework can handle content negotiation smoothly, ensuring the right format is sent back to the client without you having to handle all the details manually.

      In summary, HTTP content negotiation and the Response class in REST framework help you efficiently manage how data is sent and received between your Django application and its clients, ensuring flexibility and ease of use.

    1. Trade-offs between views vs ViewSets Using ViewSets can be a really useful abstraction. It helps ensure that URL conventions will be consistent across your API, minimizes the amount of code you need to write, and allows you to concentrate on the interactions and representations your API provides rather than the specifics of the URL conf. That doesn't mean it's always the right approach to take. There's a similar set of trade-offs to consider as when using class-based views instead of function-based views. Using ViewSets is less explicit than building your API views individually.

      Trade-offs Between Views and ViewSets

      When deciding whether to use views or ViewSets in Django REST framework, it's important to understand the benefits and drawbacks of each approach. Both have their own use cases and can be more suitable in different scenarios.

      Views

      Pros:

      1. Explicit and Customizable: You have full control over each view. This allows you to handle complex logic and special cases more easily.
      2. Fine-grained Control: Allows you to define exactly what each view does, making it easier to optimize performance and security for specific endpoints.
      3. Simplicity: For small projects or APIs with a limited number of endpoints, views might be simpler to implement and understand.

      Cons:

      1. Repetitive Code: You might end up writing a lot of boilerplate code, especially if your API has many endpoints with similar logic.
      2. Inconsistent URL Patterns: It's easier to accidentally create inconsistencies in your API's URL patterns and behavior if you're manually defining each endpoint.
      3. Maintenance: As your project grows, maintaining a large number of individual views can become cumbersome and error-prone.

      ViewSets

      Pros:

      1. Consistency: Ensures that URL patterns and behavior are consistent across your API, following standard REST conventions.
      2. Less Boilerplate: Reduces the amount of code you need to write. Common CRUD operations are automatically handled.
      3. Easier Refactoring: Grouping related views into a single ViewSet can make it easier to refactor and maintain your code.
      4. DRY Principle: Helps to keep your code DRY (Don't Repeat Yourself), reducing redundancy.

      Cons:

      1. Less Explicit: Abstracts away some of the details, which can make the behavior of your API less explicit and harder to understand at a glance.
      2. Customization: While ViewSets are great for standard CRUD operations, they can be less flexible for endpoints that require complex or custom behavior.
      3. Learning Curve: For developers new to Django REST framework, understanding the additional layer of abstraction might take some time.

      When to Use Views vs. ViewSets

      • Use Views When:
      • You need fine-grained control over each endpoint.
      • Your API has complex or custom behavior that doesn't fit the standard CRUD operations.
      • You're building a small project with only a few endpoints.
      • You want the explicitness and clarity of defining each view individually.

      • Use ViewSets When:

      • You want to minimize boilerplate code for standard CRUD operations.
      • Consistency across your API is a priority.
      • Your API has many endpoints that follow standard REST conventions.
      • You prefer to focus on the high-level design of your API rather than the specifics of URL configuration.

      Example Scenario

      Views Approach: If you're building a small API with a few endpoints that require complex custom behavior, you might choose to define each view individually. This approach gives you full control over each endpoint and makes the logic explicit.

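      ```python
      from rest_framework import status
      from rest_framework.response import Response
      from rest_framework.views import APIView

      from .models import Snippet
      from .serializers import SnippetSerializer


      class SnippetList(APIView):
          """List all snippets, or create a new snippet."""

          def get(self, request, format=None):
              snippets = Snippet.objects.all()
              serializer = SnippetSerializer(snippets, many=True)
              return Response(serializer.data)

          def post(self, request, format=None):
              serializer = SnippetSerializer(data=request.data)
              if serializer.is_valid():
                  serializer.save(owner=request.user)
                  return Response(serializer.data, status=status.HTTP_201_CREATED)
              return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
      ```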

      ViewSets Approach: For a larger API with many standard CRUD operations, using ViewSets can save a lot of time and reduce boilerplate code. It ensures that your API follows consistent URL patterns and behavior.

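      ```python
      from rest_framework import permissions, renderers, viewsets
      from rest_framework.decorators import action
      from rest_framework.response import Response

      from .models import Snippet
      from .permissions import IsOwnerOrReadOnly
      from .serializers import SnippetSerializer


      class SnippetViewSet(viewsets.ModelViewSet):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly, IsOwnerOrReadOnly]

          # Extra endpoint on a single snippet, rendered as static HTML.
          @action(detail=True, renderer_classes=[renderers.StaticHTMLRenderer])
          def highlight(self, request, *args, **kwargs):
              snippet = self.get_object()
              return Response(snippet.highlighted)
      ```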

      Router Configuration: Using a router simplifies URL configuration, reducing the risk of inconsistencies and making it easier to manage a large number of endpoints.

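      ```python
      from django.urls import include, path
      from rest_framework.routers import DefaultRouter

      from .views import SnippetViewSet, UserViewSet

      # Register each ViewSet under a URL prefix; the router builds the routes.
      router = DefaultRouter()
      router.register(r'snippets', SnippetViewSet)
      router.register(r'users', UserViewSet)

      urlpatterns = [
          path('', include(router.urls)),
      ]
      ```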
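To see roughly what a router does with each registration, here is a toy sketch in plain Python. It is not `DefaultRouter`'s implementation: `MiniRouter`, the tuple-based routes, and the empty stand-in ViewSet classes are all hypothetical. It only mirrors the standard method-to-action bindings and the `<basename>-list` / `<basename>-detail` naming convention.

```python
# Standard bindings of HTTP methods to ViewSet actions.
LIST_ACTIONS = {"get": "list", "post": "create"}
DETAIL_ACTIONS = {
    "get": "retrieve",
    "put": "update",
    "patch": "partial_update",
    "delete": "destroy",
}


class SnippetViewSet:
    """Toy ViewSet exposing the full set of read and write actions."""
    def list(self): ...
    def create(self): ...
    def retrieve(self): ...
    def update(self): ...
    def partial_update(self): ...
    def destroy(self): ...


class UserViewSet:
    """Toy read-only ViewSet."""
    def list(self): ...
    def retrieve(self): ...


class MiniRouter:
    """Toy router: turns registrations into (pattern, actions, name) routes."""

    def __init__(self):
        self.registry = []

    def register(self, prefix, viewset, basename):
        self.registry.append((prefix, viewset, basename))

    @staticmethod
    def _supported(viewset, action_map):
        # Keep only the actions the ViewSet actually implements.
        return {m: a for m, a in action_map.items() if hasattr(viewset, a)}

    @property
    def urls(self):
        routes = []
        for prefix, viewset, basename in self.registry:
            routes.append((f"{prefix}/", self._supported(viewset, LIST_ACTIONS), f"{basename}-list"))
            routes.append((f"{prefix}/<int:pk>/", self._supported(viewset, DETAIL_ACTIONS), f"{basename}-detail"))
        return routes


router = MiniRouter()
router.register("snippets", SnippetViewSet, basename="snippet")
router.register("users", UserViewSet, basename="user")

for pattern, actions, name in router.urls:
    print(f"{name:15} {pattern:22} {sorted(actions)}")
```

The point of the sketch is the consistency argument: because every route is derived from the same two tables, all registered resources end up with the same URL shapes and names.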

      By considering these trade-offs, you can choose the approach that best fits the needs of your project and team.

    2.
       ```python
       from rest_framework import permissions
       from rest_framework import renderers
       from rest_framework.decorators import action
       from rest_framework.response import Response


       class SnippetViewSet(viewsets.ModelViewSet):
           """
           This ViewSet automatically provides `list`, `create`, `retrieve`,
           `update` and `destroy` actions.

           Additionally we also provide an extra `highlight` action.
           """
           queryset = Snippet.objects.all()
           serializer_class = SnippetSerializer
           permission_classes = [permissions.IsAuthenticatedOrReadOnly,
                                 IsOwnerOrReadOnly]

           @action(detail=True, renderer_classes=[renderers.StaticHTMLRenderer])
           def highlight(self, request, *args, **kwargs):
               snippet = self.get_object()
               return Response(snippet.highlighted)

           def perform_create(self, serializer):
               serializer.save(owner=self.request.user)
       ```

       This time we've used the ModelViewSet class in order to get the complete set of default read and write operations.

       Notice that we've also used the @action decorator to create a custom action, named highlight. This decorator can be used to add any custom endpoints that don't fit into the standard create/update/delete style.

       Custom actions which use the @action decorator will respond to GET requests by default. We can use the methods argument if we wanted an action that responded to POST requests.

       The URLs for custom actions by default depend on the method name itself. If you want to change the way url should be constructed, you can include url_path as a decorator keyword argument.

       Binding ViewSets to URLs explicitly

       The handler methods only get bound to the actions when we define the URLConf. To see what's going on under the hood let's first explicitly create a set of views from our ViewSets.

       In the snippets/urls.py file we bind our ViewSet classes into a set of concrete views.

       ```python
       from rest_framework import renderers

       from snippets.views import api_root, SnippetViewSet, UserViewSet

       snippet_list = SnippetViewSet.as_view({
           'get': 'list',
           'post': 'create'
       })
       snippet_detail = SnippetViewSet.as_view({
           'get': 'retrieve',
           'put': 'update',
           'patch': 'partial_update',
           'delete': 'destroy'
       })
       snippet_highlight = SnippetViewSet.as_view({
           'get': 'highlight'
       }, renderer_classes=[renderers.StaticHTMLRenderer])
       user_list = UserViewSet.as_view({
           'get': 'list'
       })
       user_detail = UserViewSet.as_view({
           'get': 'retrieve'
       })
       ```

       Notice how we're creating multiple views from each ViewSet class, by binding the HTTP methods to the required action for each view.

       Now that we've bound our resources into concrete views, we can register the views with the URL conf as usual.

       ```python
       urlpatterns = format_suffix_patterns([
           path('', api_root),
           path('snippets/', snippet_list, name='snippet-list'),
           path('snippets/<int:pk>/', snippet_detail, name='snippet-detail'),
           path('snippets/<int:pk>/highlight/', snippet_highlight, name='snippet-highlight'),
           path('users/', user_list, name='user-list'),
           path('users/<int:pk>/', user_detail, name='user-detail')
       ])
       ```

       Using Routers

       Because we're using ViewSet classes rather than View classes, we actually don't need to design the URL conf ourselves. The conventions for wiring up resources into views and urls can be handled automatically, using a Router class. All we need to do is register the appropriate view sets with a router, and let it do the rest.

       Here's our re-wired snippets/urls.py file.

       ```python
       from django.urls import path, include
       from rest_framework.routers import DefaultRouter

       from snippets import views

       # Create a router and register our ViewSets with it.
       router = DefaultRouter()
       router.register(r'snippets', views.SnippetViewSet, basename='snippet')
       router.register(r'users', views.UserViewSet, basename='user')

       # The API URLs are now determined automatically by the router.
       urlpatterns = [
           path('', include(router.urls)),
       ]
       ```

       Registering the ViewSets with the router is similar to providing a urlpattern. We include two arguments - the URL prefix for the views, and the view set itself.

       The DefaultRouter class we're using also automatically creates the API root view for us, so we can now delete the api_root function from our views module.

      Let's break down how to refactor our views using ViewSets and Routers and ensure that everything is wired correctly.

      Step-by-Step Refactoring

      1. Create ViewSets: Define the UserViewSet and SnippetViewSet.
      2. Use @action decorator: Add custom actions to the SnippetViewSet.
      3. Bind ViewSets to URLs: Use DefaultRouter to automatically generate URL patterns.

      Updated Code

      1. Creating the ViewSets

      In snippets/views.py:

      ```python
      from django.contrib.auth.models import User
      from rest_framework import permissions, renderers, viewsets
      from rest_framework.decorators import action
      from rest_framework.response import Response

      from .models import Snippet
      from .permissions import IsOwnerOrReadOnly
      from .serializers import SnippetSerializer, UserSerializer


      class UserViewSet(viewsets.ReadOnlyModelViewSet):
          """
          This viewset automatically provides list and retrieve actions.
          """
          queryset = User.objects.all()
          serializer_class = UserSerializer


      class SnippetViewSet(viewsets.ModelViewSet):
          """
          This viewset automatically provides list, create, retrieve,
          update, and destroy actions.

          Additionally, we provide a custom highlight action.
          """
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly, IsOwnerOrReadOnly]

          @action(detail=True, renderer_classes=[renderers.StaticHTMLRenderer])
          def highlight(self, request, *args, **kwargs):
              # Return the pre-rendered HTML stored on the snippet.
              snippet = self.get_object()
              return Response(snippet.highlighted)

          def perform_create(self, serializer):
              # Record the requesting user as the owner of the new snippet.
              serializer.save(owner=self.request.user)
      ```

      2. Binding ViewSets to URLs Explicitly (if not using a Router)

      In snippets/urls.py, you can manually bind the ViewSet actions to URL patterns. This is not necessary if you use a Router but is useful for understanding how things work under the hood:

      ```python
      from django.urls import path
      from rest_framework import renderers  # needed for StaticHTMLRenderer below
      from rest_framework.urlpatterns import format_suffix_patterns

      from snippets.views import SnippetViewSet, UserViewSet, api_root

      # Each as_view() call maps HTTP methods to ViewSet actions.
      snippet_list = SnippetViewSet.as_view({
          'get': 'list',
          'post': 'create'
      })
      snippet_detail = SnippetViewSet.as_view({
          'get': 'retrieve',
          'put': 'update',
          'patch': 'partial_update',
          'delete': 'destroy'
      })
      snippet_highlight = SnippetViewSet.as_view({
          'get': 'highlight'
      }, renderer_classes=[renderers.StaticHTMLRenderer])
      user_list = UserViewSet.as_view({
          'get': 'list'
      })
      user_detail = UserViewSet.as_view({
          'get': 'retrieve'
      })

      urlpatterns = format_suffix_patterns([
          path('', api_root),
          path('snippets/', snippet_list, name='snippet-list'),
          path('snippets/<int:pk>/', snippet_detail, name='snippet-detail'),
          path('snippets/<int:pk>/highlight/', snippet_highlight, name='snippet-highlight'),
          path('users/', user_list, name='user-list'),
          path('users/<int:pk>/', user_detail, name='user-detail')
      ])
      ```
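To make the method-to-action binding concrete, here is a minimal pure-Python sketch of the idea behind `as_view({'get': 'list', ...})`. It is a toy model only, not REST framework's implementation: `TinyViewSet`, the dictionary-shaped request, and the returned strings are all assumptions made for illustration.

```python
class TinyViewSet:
    """Hypothetical stand-in for a ViewSet: each action is a plain method."""

    def list(self, request):
        return "all items"

    def create(self, request):
        return f"created {request['data']}"

    @classmethod
    def as_view(cls, actions):
        """Bind HTTP methods to action names and return a single callable view."""
        def view(request):
            action_name = actions.get(request["method"].lower())
            if action_name is None:
                # No action was bound to this HTTP method.
                return "405 Method Not Allowed"
            handler = getattr(cls(), action_name)
            return handler(request)
        return view


# One class, one concrete view: GET maps to list, POST maps to create.
item_list = TinyViewSet.as_view({"get": "list", "post": "create"})

print(item_list({"method": "GET"}))
print(item_list({"method": "POST", "data": "print(123)"}))
print(item_list({"method": "DELETE"}))
```

The same class can be turned into several views simply by passing a different mapping, which is what the explicit `snippet_list` and `snippet_detail` bindings above do.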

      3. Using Routers

      To simplify the URL configuration, use a Router. In snippets/urls.py:

      ```python
      from django.urls import path, include
      from rest_framework.routers import DefaultRouter

      from snippets import views

      # Create a router and register our viewsets with it.
      router = DefaultRouter()
      router.register(r'snippets', views.SnippetViewSet, basename='snippet')
      router.register(r'users', views.UserViewSet, basename='user')

      # The API URLs are now determined automatically by the router.
      urlpatterns = [
          path('', include(router.urls)),
          path('api-auth/', include('rest_framework.urls', namespace='rest_framework'))
      ]
      ```

      How It Works

      1. ViewSets: Group related view logic (list, retrieve, create, update, delete) into a single class.
      2. Routers: Automatically generate URL patterns for ViewSets based on common conventions.
      3. Custom Actions: Use the @action decorator to add custom endpoints that don't fit the standard CRUD operations.
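The convention a router follows can be sketched in plain Python. The function below is a simplified, hypothetical model (`build_routes` is not part of REST framework, and the real router generates regex-based patterns rather than `<int:pk>` converters); it only reproduces the URL shapes and names used in the explicit URL conf shown earlier.

```python
def build_routes(prefix, basename, extra_detail_actions=()):
    """Return (url_pattern, {http_method: action}, route_name) tuples.

    Hypothetical helper: mirrors the naming convention from the tutorial,
    not the framework's internal route generation.
    """
    routes = [
        (f"{prefix}/", {"get": "list", "post": "create"}, f"{basename}-list"),
        (
            f"{prefix}/<int:pk>/",
            {
                "get": "retrieve",
                "put": "update",
                "patch": "partial_update",
                "delete": "destroy",
            },
            f"{basename}-detail",
        ),
    ]
    # Each custom detail action gets its own URL below the detail route.
    for action_name in extra_detail_actions:
        routes.append(
            (
                f"{prefix}/<int:pk>/{action_name}/",
                {"get": action_name},
                f"{basename}-{action_name}",
            )
        )
    return routes


snippet_routes = build_routes("snippets", "snippet", extra_detail_actions=["highlight"])
for pattern, mapping, name in snippet_routes:
    print(f"{name:20} {pattern:32} {mapping}")
```

Registering a viewset with a prefix and a basename is therefore enough information to derive every pattern that was previously written by hand.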

      Benefits

      • Simplifies URL Configuration: Routers handle the URL routing automatically.
      • Combines Related Views: ViewSets group related views into a single class, reducing redundancy.
      • Flexible: Easily add custom actions to ViewSets.

      By refactoring to use ViewSets and Routers, your code becomes cleaner, more maintainable, and easier to understand. The framework handles much of the repetitive boilerplate code, allowing you to focus on the unique aspects of your application.

    3. Refactoring to use ViewSets Let's take our current set of views, and refactor them into view sets. First of all let's refactor our UserList and UserDetail classes into a single UserViewSet class. In the snippets/views.py file, we can remove the two view classes and replace them with a single ViewSet class: from rest_framework import viewsets class UserViewSet(viewsets.ReadOnlyModelViewSet): """ This viewset automatically provides `list` and `retrieve` actions. """ queryset = User.objects.all() serializer_class = UserSerializer Here we've used the ReadOnlyModelViewSet class to automatically provide the default 'read-only' operations. We're still setting the queryset and serializer_class attributes exactly as we did when we were using regular views, but we no longer need to provide the same information to two separate classes. Next we're going to replace the SnippetList, SnippetDetail and SnippetHighlight view classes. We can remove the three views, and again replace them with a single class.

      Let's continue with refactoring our views by combining the SnippetList, SnippetDetail, and SnippetHighlight classes into a single SnippetViewSet class. This will make our code more concise and maintainable.

      Refactoring Steps

      1. Refactor User Views: Combine UserList and UserDetail into UserViewSet.
      2. Refactor Snippet Views: Combine SnippetList, SnippetDetail, and SnippetHighlight into SnippetViewSet.
      3. Update URLs: Use a router to generate URL patterns automatically.

      Step 1: Refactor User Views

      In snippets/views.py, refactor the user views:

      ```python
      from django.contrib.auth.models import User
      from rest_framework import viewsets

      from .serializers import UserSerializer


      class UserViewSet(viewsets.ReadOnlyModelViewSet):
          """
          This viewset automatically provides list and retrieve actions.
          """
          queryset = User.objects.all()
          serializer_class = UserSerializer
      ```

      Step 2: Refactor Snippet Views

      In snippets/views.py, refactor the snippet views:

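      ```python
      from rest_framework import permissions, renderers, viewsets
      from rest_framework.decorators import action  # required by @action below
      from rest_framework.response import Response  # required by highlight()

      from .models import Snippet
      from .permissions import IsOwnerOrReadOnly
      from .serializers import SnippetSerializer


      class SnippetViewSet(viewsets.ModelViewSet):
          """
          This viewset automatically provides list, create, retrieve,
          update, and destroy actions.

          Additionally, we also provide an extra highlight action.
          """
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          # Same permissions as the views being replaced.
          permission_classes = [permissions.IsAuthenticatedOrReadOnly, IsOwnerOrReadOnly]

          @action(detail=True, renderer_classes=[renderers.StaticHTMLRenderer])
          def highlight(self, request, *args, **kwargs):
              snippet = self.get_object()
              return Response(snippet.highlighted)

          def perform_create(self, serializer):
              # Record the requesting user as the owner of the new snippet.
              serializer.save(owner=self.request.user)
      ```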

      Step 3: Update URLs

      In snippets/urls.py, register the viewsets with a router and update the URL patterns:

      ```python
      from django.urls import path, include
      from rest_framework.routers import DefaultRouter

      from snippets import views

      # Create a router and register our viewsets with it.
      router = DefaultRouter()
      router.register(r'snippets', views.SnippetViewSet)
      router.register(r'users', views.UserViewSet)

      # The API URLs are now determined automatically by the router.
      # Additionally, we include the login URLs for the browsable API.
      urlpatterns = [
          path('', include(router.urls)),
          path('api-auth/', include('rest_framework.urls', namespace='rest_framework'))
      ]
      ```

      How It Works

      1. UserViewSet: Combines the UserList and UserDetail views into a single viewset that handles read-only operations.
      2. SnippetViewSet: Combines the SnippetList, SnippetDetail, and SnippetHighlight views into a single viewset. The highlight action is defined as a custom action within the viewset.
      3. Router: The DefaultRouter generates URL patterns automatically, so you don't have to define them manually.

      Example in Action

      User requests and the viewset that handles each one:

      • GET /users/: Lists all users (handled by UserViewSet).
      • GET /users/1/: Retrieves details for user with ID 1 (handled by UserViewSet).
      • GET /snippets/: Lists all snippets (handled by SnippetViewSet).
      • GET /snippets/1/: Retrieves details for snippet with ID 1 (handled by SnippetViewSet).
      • GET /snippets/1/highlight/: Retrieves highlighted HTML for snippet with ID 1 (handled by SnippetViewSet).

      By refactoring to use ViewSets and Routers, our code becomes cleaner and more maintainable, and we let the framework handle much of the repetitive boilerplate code for us.

    1. Hyperlinking our API Dealing with relationships between entities is one of the more challenging aspects of Web API design. There are a number of different ways that we might choose to represent a relationship: Using primary keys. Using hyperlinking between entities. Using a unique identifying slug field on the related entity. Using the default string representation of the related entity. Nesting the related entity inside the parent representation. Some other custom representation. REST framework supports all of these styles, and can apply them across forward or reverse relationships, or apply them across custom managers such as generic foreign keys. In this case we'd like to use a hyperlinked style between entities. In order to do so, we'll modify our serializers to extend HyperlinkedModelSerializer instead of the existing ModelSerializer. The HyperlinkedModelSerializer has the following differences from ModelSerializer: It does not include the id field by default. It includes a url field, using HyperlinkedIdentityField. Relationships use HyperlinkedRelatedField, instead of PrimaryKeyRelatedField. We can easily re-write our existing serializers to use hyperlinking. In your snippets/serializers.py add: class SnippetSerializer(serializers.HyperlinkedModelSerializer): owner = serializers.ReadOnlyField(source='owner.username') highlight = serializers.HyperlinkedIdentityField(view_name='snippet-highlight', format='html') class Meta: model = Snippet fields = ['url', 'id', 'highlight', 'owner', 'title', 'code', 'linenos', 'language', 'style'] class UserSerializer(serializers.HyperlinkedModelSerializer): snippets = serializers.HyperlinkedRelatedField(many=True, view_name='snippet-detail', read_only=True) class Meta: model = User fields = ['url', 'id', 'username', 'snippets'] Notice that we've also added a new 'highlight' field. 
This field is of the same type as the url field, except that it points to the 'snippet-highlight' url pattern, instead of the 'snippet-detail' url pattern. Because we've included format suffixed URLs such as '.json', we also need to indicate on the highlight field that any format suffixed hyperlinks it returns should use the '.html' suffix. Making sure our URL patterns are named If we're going to have a hyperlinked API, we need to make sure we name our URL patterns. Let's take a look at which URL patterns we need to name. The root of our API refers to 'user-list' and 'snippet-list'. Our snippet serializer includes a field that refers to 'snippet-highlight'. Our user serializer includes a field that refers to 'snippet-detail'. Our snippet and user serializers include 'url' fields that by default will refer to '{model_name}-detail', which in this case will be 'snippet-detail' and 'user-detail'. After adding all those names into our URLconf, our final snippets/urls.py file should look like this: from django.urls import path from rest_framework.urlpatterns import format_suffix_patterns from snippets import views # API endpoints urlpatterns = format_suffix_patterns([ path('', views.api_root), path('snippets/', views.SnippetList.as_view(), name='snippet-list'), path('snippets/<int:pk>/', views.SnippetDetail.as_view(), name='snippet-detail'), path('snippets/<int:pk>/highlight/', views.SnippetHighlight.as_view(), name='snippet-highlight'), path('users/', views.UserList.as_view(), name='user-list'), path('users/<int:pk>/', views.UserDetail.as_view(), name='user-detail') ])

      Let's break down how to create a hyperlinked API in Django REST framework, step by step, with an example.

      What is a Hyperlinked API?

      A hyperlinked API means that instead of using primary keys to reference related objects, we use URLs (hyperlinks). This makes the API more intuitive and easier to navigate.

      Steps to Create a Hyperlinked API

      1. Update Serializers:
        • Use HyperlinkedModelSerializer instead of ModelSerializer.
        • Add URL fields to represent relationships as hyperlinks.
      2. Update URL Patterns:
        • Name the URL patterns so that they can be referenced by the serializers.

      Example

      Step 1: Update Serializers

      In snippets/serializers.py, update your serializers to use HyperlinkedModelSerializer:

      ```python
      from django.contrib.auth.models import User
      from rest_framework import serializers

      from .models import Snippet


      class SnippetSerializer(serializers.HyperlinkedModelSerializer):
          owner = serializers.ReadOnlyField(source='owner.username')
          highlight = serializers.HyperlinkedIdentityField(view_name='snippet-highlight', format='html')

          class Meta:
              model = Snippet
              fields = ['url', 'id', 'highlight', 'owner',
                        'title', 'code', 'linenos', 'language', 'style']


      class UserSerializer(serializers.HyperlinkedModelSerializer):
          snippets = serializers.HyperlinkedRelatedField(many=True, view_name='snippet-detail', read_only=True)

          class Meta:
              model = User
              fields = ['url', 'id', 'username', 'snippets']
      ```

      Explanation:

      • SnippetSerializer:
        • owner: Read-only field that shows the username of the snippet owner.
        • highlight: Hyperlinked field pointing to the 'snippet-highlight' URL.
        • fields: List of fields to include in the serialized output.
      • UserSerializer:
        • snippets: Hyperlinked field that shows all snippets owned by the user, pointing to the 'snippet-detail' URL.
        • fields: List of fields to include in the serialized output.

      Step 2: Update URL Patterns

      In snippets/urls.py, name your URL patterns:

      ```python
      from django.urls import path
      from rest_framework.urlpatterns import format_suffix_patterns

      from snippets import views

      # API endpoints
      urlpatterns = format_suffix_patterns([
          path('', views.api_root),
          path('snippets/', views.SnippetList.as_view(), name='snippet-list'),
          path('snippets/<int:pk>/', views.SnippetDetail.as_view(), name='snippet-detail'),
          path('snippets/<int:pk>/highlight/', views.SnippetHighlight.as_view(), name='snippet-highlight'),
          path('users/', views.UserList.as_view(), name='user-list'),
          path('users/<int:pk>/', views.UserDetail.as_view(), name='user-detail')
      ])
      ```

      Explanation:

      • format_suffix_patterns: Allows adding format suffixes like .json or .html to URLs.
      • path: Defines URL patterns and associates them with views.
      • name: Names the URL patterns so that they can be referenced by the serializers.

      How It Works

      1. Request: When a user requests a snippet or user detail, the serializer returns URLs for related objects instead of primary keys.
      2. Navigation: The user can follow these URLs to navigate between related objects.
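The difference between a primary-key representation and a hyperlinked one can be shown with a small pure-Python sketch. Everything here is an assumption made for illustration: `BASE_URL`, the `URL_TEMPLATES` table, and the helpers `reverse_url` and `serialize_user` are hypothetical and only imitate what named URL patterns and a hyperlinked serializer provide.

```python
BASE_URL = "http://example.com"  # assumed host, matching the sample response

# Hypothetical lookup table standing in for the named URL patterns.
URL_TEMPLATES = {
    "snippet-detail": "/snippets/{pk}/",
    "snippet-highlight": "/snippets/{pk}/highlight/",
    "user-detail": "/users/{pk}/",
}


def reverse_url(name, pk):
    """Build an absolute URL from a pattern name and a primary key."""
    return BASE_URL + URL_TEMPLATES[name].format(pk=pk)


def serialize_user(user):
    """Represent related snippets as URLs rather than bare ids."""
    return {
        "url": reverse_url("user-detail", user["id"]),
        "id": user["id"],
        "username": user["username"],
        "snippets": [reverse_url("snippet-detail", pk) for pk in user["snippet_ids"]],
    }


user = {"id": 1, "username": "user1", "snippet_ids": [1, 2]}
print(serialize_user(user))
```

Because the serializer looks patterns up by name, a missing or misspelled name is the usual reason a hyperlinked field fails to resolve.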

      Example in Action

      1. User Request: GET /snippets/
      2. Response:

      ```json
      [
        {
          "url": "http://example.com/snippets/1/",
          "id": 1,
          "highlight": "http://example.com/snippets/1/highlight/",
          "owner": "user1",
          "title": "Example Snippet",
          "code": "print('Hello, World!')",
          "linenos": true,
          "language": "python",
          "style": "friendly"
        }
      ]
      ```

      Here, the owner field is a username, and the highlight and url fields are hyperlinks to the related endpoints.

      This is how you create a hyperlinked API using Django REST framework, making it easier to navigate relationships between entities.

    2. Creating an endpoint for the highlighted snippets The other obvious thing that's still missing from our pastebin API is the code highlighting endpoints. Unlike all our other API endpoints, we don't want to use JSON, but instead just present an HTML representation. There are two styles of HTML renderer provided by REST framework, one for dealing with HTML rendered using templates, the other for dealing with pre-rendered HTML. The second renderer is the one we'd like to use for this endpoint. The other thing we need to consider when creating the code highlight view is that there's no existing concrete generic view that we can use. We're not returning an object instance, but instead a property of an object instance. Instead of using a concrete generic view, we'll use the base class for representing instances, and create our own .get() method. In your snippets/views.py add: from rest_framework import renderers class SnippetHighlight(generics.GenericAPIView): queryset = Snippet.objects.all() renderer_classes = [renderers.StaticHTMLRenderer] def get(self, request, *args, **kwargs): snippet = self.get_object() return Response(snippet.highlighted) As usual we need to add the new views that we've created in to our URLconf. We'll add a url pattern for our new API root in snippets/urls.py: path('', views.api_root), And then add a url pattern for the snippet highlights: path('snippets/<int:pk>/highlight/', views.SnippetHighlight.as_view()),

      Let's break down how to create an endpoint for code highlighting in a simple way, with an example:

      What is an Endpoint?

      An endpoint is a specific URL to which clients send requests in order to read or modify data. In this case, we want to create an endpoint that returns a highlighted code snippet as an HTML representation instead of JSON.

      Steps to Create the Endpoint

      1. Create the View:
        • We will create a view called SnippetHighlight that will handle requests to highlight a code snippet.
        • This view will use a special renderer to return HTML instead of JSON.
        • Since there's no built-in view that fits our need exactly, we will create a custom view by extending GenericAPIView.
      2. Update URLs:
        • We will add a new URL pattern to link to our SnippetHighlight view.

      Example

      Step 1: Create the View

      First, we create our custom view in snippets/views.py:

      ```python
      from rest_framework import generics, renderers
      from rest_framework.response import Response

      from .models import Snippet


      class SnippetHighlight(generics.GenericAPIView):
          queryset = Snippet.objects.all()
          renderer_classes = [renderers.StaticHTMLRenderer]

          def get(self, request, *args, **kwargs):
              # Look up the snippet from the URL's pk and return its stored HTML.
              snippet = self.get_object()
              return Response(snippet.highlighted)
      ```

      Explanation:

      • Import Statements: We import the necessary modules.
      • SnippetHighlight Class: This class handles the requests to highlight snippets.
        • queryset: Specifies which snippets are available.
        • renderer_classes: Tells REST framework to render the response with the static HTML renderer instead of the JSON renderer.
        • get() Method: Handles GET requests. It fetches the requested snippet and returns its highlighted HTML.

      Step 2: Update URLs

      Next, we add the URL pattern in snippets/urls.py:

      ```python
      from django.urls import path

      from . import views

      urlpatterns = [
          path('', views.api_root),  # Your API root
          path('snippets/<int:pk>/highlight/', views.SnippetHighlight.as_view()),  # URL for highlighting
      ]
      ```

      Explanation:

      • path('', views.api_root): This is the root of our API.
      • path('snippets/<int:pk>/highlight/', views.SnippetHighlight.as_view()): This URL pattern connects to our SnippetHighlight view. <int:pk> is a placeholder for the snippet's ID.

      How It Works

      1. Request: A user sends a GET request to /snippets/1/highlight/ to highlight snippet with ID 1.
      2. View: The SnippetHighlight view handles the request. It fetches the snippet with ID 1, gets its highlighted HTML, and returns it.
      3. Response: The user receives the highlighted HTML of the snippet.
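The reason this endpoint needs a different renderer can be illustrated with a rough pure-Python sketch. The two functions below are hypothetical stand-ins, not REST framework's renderer classes: one serializes data to JSON, the other passes an already-rendered HTML string through unchanged, which is the behaviour the static HTML renderer is used for here. The sample HTML string is an assumption.

```python
import json


def render_json(data):
    """Stand-in for a JSON renderer: serialize the data structure."""
    return "application/json", json.dumps(data)


def render_static_html(data):
    """Stand-in for a static HTML renderer: the data is already HTML."""
    return "text/html", data


# Assumed example of a pre-rendered, highlighted snippet.
highlighted = "<pre><span class='k'>print</span>(123)</pre>"

print(render_json({"id": 1, "code": "print(123)"}))
print(render_static_html(highlighted))
```

Passing the HTML string to the JSON stand-in would wrap it in quotes and escape it, which is why the view returns the string directly and selects the HTML renderer instead.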

      Example in Action

      1. User Request: GET /snippets/1/highlight/
      2. Backend Processing:
        • The view fetches snippet 1 from the database.
        • It gets the highlighted HTML of snippet 1.
      3. Response: The user gets the HTML representation of the highlighted code snippet.

      This is how you create an endpoint to highlight code snippets and return HTML using Django REST Framework.

    1. Authenticating with the API Because we now have a set of permissions on the API, we need to authenticate our requests to it if we want to edit any snippets. We haven't set up any authentication classes, so the defaults are currently applied, which are SessionAuthentication and BasicAuthentication. When we interact with the API through the web browser, we can login, and the browser session will then provide the required authentication for the requests. If we're interacting with the API programmatically we need to explicitly provide the authentication credentials on each request. If we try to create a snippet without authenticating, we'll get an error: http POST http://127.0.0.1:8000/snippets/ code="print(123)" { "detail": "Authentication credentials were not provided." } We can make a successful request by including the username and password of one of the users we created earlier. http -a admin:password123 POST http://127.0.0.1:8000/snippets/ code="print(789)" { "id": 1, "owner": "admin", "title": "foo", "code": "print(789)", "linenos": false, "language": "python", "style": "friendly" } Summary We've now got a fairly fine-grained set of permissions on our Web API, and end points for users of the system and for the code snippets that they have created. In part 5 of the tutorial we'll look at how we can tie everything together by creating an HTML endpoint for our highlighted snippets, and improve the cohesion of our API by using hyperlinking for the relationships within the system.

      Authenticating with the API

      Now that we have set permissions on the API, it's essential to authenticate our requests if we want to perform actions like creating, updating, or deleting snippets.

      Default Authentication Classes

      By default, Django REST Framework uses the following authentication classes:

      • SessionAuthentication: Uses Django's session framework.
      • BasicAuthentication: Uses HTTP Basic Authentication.

      When interacting with the API through a web browser, logging in through the browser session provides the required authentication for subsequent requests. However, for programmatic interaction, you need to explicitly include authentication credentials with each request.
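For programmatic access with HTTP Basic Authentication, the credentials travel in an `Authorization` header on every request. The sketch below shows how that header is formed using only the standard library; `basic_auth_header` is a hypothetical helper name, and the credentials are the example ones used in this tutorial.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic Authentication."""
    # The scheme is "Basic " followed by base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("admin", "password123")
print(headers)
```

Base64 is an encoding, not encryption, so Basic Authentication should only be sent over HTTPS outside of local development.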

      Example of an Unauthenticated Request

      If you try to create a snippet without authenticating, you will receive an error:

      Unauthenticated Request Example

      ```bash
      http POST http://127.0.0.1:8000/snippets/ code="print(123)"
      ```

      Response

      ```json
      {
        "detail": "Authentication credentials were not provided."
      }
      ```

      Example of an Authenticated Request

      To make an authenticated request, include the username and password of one of the users you created earlier.

      Authenticated Request Example

      Using httpie (a command-line HTTP client):

      ```bash
      http -a admin:password123 POST http://127.0.0.1:8000/snippets/ code="print(789)" title="foo"
      ```

      Response

      ```json
      {
        "id": 1,
        "owner": "admin",
        "title": "foo",
        "code": "print(789)",
        "linenos": false,
        "language": "python",
        "style": "friendly"
      }
      ```

      Summary

      • Permissions: The API now has fine-grained permissions to ensure only authenticated users can create, update, or delete snippets.
      • Authentication: Use either session-based or basic authentication for interacting with the API.
      • Programmatic Access: Include the username and password in the request to authenticate programmatically.

      Next Steps

      In the next part of the tutorial, we'll:

      • Create an HTML endpoint for highlighted snippets.
      • Enhance the cohesion of the API by using hyperlinking for the relationships within the system.

      This will further improve the usability and functionality of our web API, making it more intuitive and user-friendly.

    2. Object level permissions Really we'd like all code snippets to be visible to anyone, but also make sure that only the user that created a code snippet is able to update or delete it. To do that we're going to need to create a custom permission. In the snippets app, create a new file, permissions.py from rest_framework import permissions class IsOwnerOrReadOnly(permissions.BasePermission): """ Custom permission to only allow owners of an object to edit it. """ def has_object_permission(self, request, view, obj): # Read permissions are allowed to any request, # so we'll always allow GET, HEAD or OPTIONS requests. if request.method in permissions.SAFE_METHODS: return True # Write permissions are only allowed to the owner of the snippet. return obj.owner == request.user Now we can add that custom permission to our snippet instance endpoint, by editing the permission_classes property on the SnippetDetail view class: permission_classes = [permissions.IsAuthenticatedOrReadOnly, IsOwnerOrReadOnly] Make sure to also import the IsOwnerOrReadOnly class. from snippets.permissions import IsOwnerOrReadOnly Now, if you open a browser again, you find that the 'DELETE' and 'PUT' actions only appear on a snippet instance endpoint if you're logged in as the same user that created the code snippet.

      To ensure that all code snippets are visible to everyone but only the user who created a snippet can update or delete it, you can create a custom permission class. This custom permission will be added to the snippet instance endpoint to enforce these rules.

      Step-by-Step Instructions

      1. Create Custom Permission Class: Create a new file called permissions.py in your snippets app directory and define a custom permission class IsOwnerOrReadOnly.

      2. Update View to Use Custom Permission: Modify the SnippetDetail view to use the custom permission class in addition to the IsAuthenticatedOrReadOnly permission class.

      Step 1: Create Custom Permission Class

      snippets/permissions.py

      ```python
      from rest_framework import permissions


      class IsOwnerOrReadOnly(permissions.BasePermission):
          """
          Custom permission to only allow owners of an object to edit it.
          """

          def has_object_permission(self, request, view, obj):
              # Read permissions are allowed to any request,
              # so we'll always allow GET, HEAD or OPTIONS requests.
              if request.method in permissions.SAFE_METHODS:
                  return True

              # Write permissions are only allowed to the owner of the snippet.
              return obj.owner == request.user
      ```

      Step 2: Update the View to Use Custom Permission

      views.py

      First, import the custom permission class at the top of your views.py file:

      ```python
      from snippets.permissions import IsOwnerOrReadOnly
      ```

      Then, update the SnippetDetail view to include the custom permission in the permission_classes property:

      ```python
      from rest_framework import generics
      from rest_framework import permissions

      from snippets.permissions import IsOwnerOrReadOnly

      from .models import Snippet
      from .serializers import SnippetSerializer


      class SnippetList(generics.ListCreateAPIView):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly]

          def perform_create(self, serializer):
              # Record the requesting user as the owner of the new snippet.
              serializer.save(owner=self.request.user)


      class SnippetDetail(generics.RetrieveUpdateDestroyAPIView):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly, IsOwnerOrReadOnly]
      ```

      Explanation of the Code

      • Custom Permission Class (IsOwnerOrReadOnly):
        • permissions.BasePermission: This is the base class for all permissions in Django REST Framework.
        • has_object_permission: This method checks whether the request has the required permissions for a specific object.
          • Read Permissions: Always allow safe methods (GET, HEAD, OPTIONS).
          • Write Permissions: Only allow if the user making the request is the owner of the object.
      • SnippetDetail View:
        • permission_classes: Combines IsAuthenticatedOrReadOnly (which allows read access to everyone and write access only to authenticated users) with IsOwnerOrReadOnly (which restricts write access to the owner of the snippet).

      What Happens Now

      • Read Access: Any user (authenticated or not) can read (list and retrieve) snippets.
      • Write Access: Only authenticated users can create snippets, and only the owner of a snippet can update or delete it.
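The combined effect of the two permission classes on a single snippet can be checked with a small pure-Python model. This is a sketch under stated assumptions, not the framework's code: users are represented by plain strings, an anonymous user by `None`, and the function names are hypothetical. Only the rule itself (safe methods always pass, writes require the owner) is taken from the permission class above.

```python
SAFE_METHODS = ("GET", "HEAD", "OPTIONS")


def is_authenticated_or_read_only(method, user):
    """Reads are open to everyone; writes need a logged-in user."""
    return method in SAFE_METHODS or user is not None


def is_owner_or_read_only(method, user, owner):
    """Reads are open to everyone; writes need the snippet's owner."""
    return method in SAFE_METHODS or user == owner


def may_access_snippet(method, user, owner):
    """Every listed permission must pass for the request to be allowed."""
    return (
        is_authenticated_or_read_only(method, user)
        and is_owner_or_read_only(method, user, owner)
    )


# The snippet in this example is owned by "alice" (assumed name).
for method, user in [("GET", None), ("PUT", None), ("PUT", "bob"), ("DELETE", "alice")]:
    print(method, user, may_access_snippet(method, user, owner="alice"))
```

Creating a snippet goes through the list endpoint, where no existing object is involved, so only the authentication check applies there.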

      Testing the Setup

      1. Open the Browsable API: Navigate to a snippet instance endpoint in your browser.
      2. Check Actions: You should see the 'DELETE' and 'PUT' actions only if you are logged in as the user who created the snippet.

      Summary

      • Purpose: Ensure all code snippets are visible to everyone, but only the creator can update or delete their snippets.
      • Implementation: Create a custom permission class and apply it to the SnippetDetail view.
      • Result: Proper access control is enforced, allowing only the snippet owner to modify their snippet while everyone can read the snippets.

      This setup ensures that your API adheres to the required permissions, providing both visibility and security.

    3. Adding login to the Browsable API If you open a browser and navigate to the browsable API at the moment, you'll find that you're no longer able to create new code snippets. In order to do so we'd need to be able to login as a user. We can add a login view for use with the browsable API, by editing the URLconf in our project-level urls.py file. Add the following import at the top of the file: from django.urls import path, include And, at the end of the file, add a pattern to include the login and logout views for the browsable API. urlpatterns += [ path('api-auth/', include('rest_framework.urls')), ] The 'api-auth/' part of pattern can actually be whatever URL you want to use. Now if you open up the browser again and refresh the page you'll see a 'Login' link in the top right of the page. If you log in as one of the users you created earlier, you'll be able to create code snippets again. Once you've created a few code snippets, navigate to the '/users/' endpoint, and notice that the representation includes a list of the snippet ids that are associated with each user, in each user's 'snippets' field.

      To add login functionality to the browsable API in Django REST Framework, you need to include the authentication URLs in your project’s urls.py file. This will allow users to log in and log out via the browsable API interface.

      Step-by-Step Instructions

      1. Import the Required Modules: Add the necessary imports at the top of your urls.py file.
      2. Include the Authentication URLs: Add a URL pattern to include the login and logout views for the browsable API.

      Code Example

      project-level urls.py

      First, import the necessary modules:

      from django.urls import path, include

      Then, add the authentication URL pattern:

      ```python
      from django.contrib import admin
      from django.urls import path, include

      urlpatterns = [
          path('admin/', admin.site.urls),
          path('api/', include('your_app.urls')),  # Include your app's URLs
          path('api-auth/', include('rest_framework.urls')),  # Add this line
      ]
      ```

      Explanation of the Code

      • path('api-auth/', include('rest_framework.urls')): This line adds the authentication URLs provided by Django REST Framework. It allows users to log in and log out through the browsable API.

      What Happens Now

      • Login Link: When you navigate to the browsable API in your browser, you will see a "Login" link in the top right corner.
      • Login and Logout: Clicking on the "Login" link will take you to a login page where you can enter your credentials to log in. Once logged in, you can create, update, and delete snippets if you have the necessary permissions.

      Testing the Setup

      1. Open the Browsable API: Navigate to the browsable API in your browser.
      2. Login: Click on the "Login" link in the top right corner and log in with a user account.
      3. Create Snippets: Once logged in, you will be able to create new code snippets and perform other actions that require authentication.

      Example

      Let's assume you have already created a few users and code snippets. After logging in as one of these users, you can create a new snippet via the browsable API.

      Creating a New Snippet

      1. Navigate to the endpoint for creating snippets (e.g., /api/snippets/).
      2. Fill in the details for the new snippet and submit the form.

      Viewing Users

      Navigate to the /api/users/ endpoint. The representation will include a list of snippet IDs associated with each user in the snippets field.

      Summary

      • Purpose: Adding login functionality to the browsable API allows users to authenticate and perform actions that require login.
      • Implementation: Include the rest_framework.urls in your urls.py file.
      • Result: Users can log in and log out via the browsable API, enabling them to create, update, and delete snippets.

      This setup enhances the usability of your API, making it easier for users to interact with it directly through the browser.

    4. Adding required permissions to views Now that code snippets are associated with users, we want to make sure that only authenticated users are able to create, update and delete code snippets. REST framework includes a number of permission classes that we can use to restrict who can access a given view. In this case the one we're looking for is IsAuthenticatedOrReadOnly, which will ensure that authenticated requests get read-write access, and unauthenticated requests get read-only access. First add the following import in the views module from rest_framework import permissions Then, add the following property to both the SnippetList and SnippetDetail view classes. permission_classes = [permissions.IsAuthenticatedOrReadOnly]

      To ensure that only authenticated users can create, update, or delete code snippets while allowing unauthenticated users to read the snippets, we can use the IsAuthenticatedOrReadOnly permission class from Django REST Framework.

      Here's how you can implement this in your views:

      Step-by-Step Instructions

      1. Import Permissions: First, import the permissions module from Django REST Framework.
      2. Set Permission Classes: Add the permission_classes property to both the SnippetList and SnippetDetail view classes, setting it to IsAuthenticatedOrReadOnly.

      Code Example

      views.py

      First, import the permissions at the top of your views.py file:

      from rest_framework import permissions

      Then, update your view classes to include the permission_classes property:

      ```python
      from rest_framework import generics, permissions

      from .models import Snippet
      from .serializers import SnippetSerializer


      class SnippetList(generics.ListCreateAPIView):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly]

          def perform_create(self, serializer):
              # Record the requesting user as the snippet's owner.
              serializer.save(owner=self.request.user)


      class SnippetDetail(generics.RetrieveUpdateDestroyAPIView):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [permissions.IsAuthenticatedOrReadOnly]
      ```

      Explanation of the Code

      • permission_classes: This property specifies the permissions that are required to access the view.
      • permissions.IsAuthenticatedOrReadOnly: This permission class ensures that:
          • Authenticated users (logged in) can perform any action (read, create, update, delete).
          • Unauthenticated users (not logged in) can only read the data (list and retrieve).

      What Happens Now

      • Authenticated Users:
          • Can create new snippets.
          • Can update existing snippets.
          • Can delete snippets.
          • Can read (list and retrieve) snippets.
      • Unauthenticated Users:
          • Can only read (list and retrieve) snippets.
          • Cannot create new snippets.
          • Cannot update snippets.
          • Cannot delete snippets.

      Example Usage

      As an Authenticated User

      When a logged-in user sends a POST request to create a snippet, it will be allowed because they have read-write access. The example assumes that token authentication is enabled for the API and sends the body as JSON.

      curl -X POST -H "Authorization: Token <user-token>" -H "Content-Type: application/json" -d '{"title": "New Snippet", "code": "print(123)"}' http://example.com/snippets/

      As an Unauthenticated User

      When a user who is not logged in sends a POST request to create a snippet, it will be denied because they only have read-only access.

      curl -X POST -H "Content-Type: application/json" -d '{"title": "New Snippet", "code": "print(123)"}' http://example.com/snippets/

      This request is rejected. With REST framework's default authentication classes the response is 403 Forbidden; if the first configured authentication class is token-based, the response is 401 Unauthorized instead.

      Summary

      • Purpose: To restrict create, update, and delete actions to authenticated users, while allowing unauthenticated users to read data.
      • Implementation: Use the IsAuthenticatedOrReadOnly permission class in the view classes.
      • Effect: Authenticated users get full access, while unauthenticated users get read-only access.

      This setup helps protect your data by ensuring that only users who are logged in can modify it, while still allowing anyone to view the data.
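      The rule described in this section reduces to a single boolean expression. The sketch below models it in plain Python, without Django. The function name authenticated_or_read_only is an illustrative assumption and not part of REST framework; only the tuple of safe methods mirrors the framework's SAFE_METHODS constant.

```python
# Simplified model of the IsAuthenticatedOrReadOnly decision (not DRF's source).
SAFE_METHODS = ("GET", "HEAD", "OPTIONS")  # methods that only read data


def authenticated_or_read_only(method, is_authenticated):
    """Allow reads for everyone and writes only for authenticated users."""
    return method in SAFE_METHODS or is_authenticated


# Print the outcome for each method, for anonymous and logged-in callers.
for method in ("GET", "POST", "PUT", "DELETE"):
    anonymous = authenticated_or_read_only(method, is_authenticated=False)
    logged_in = authenticated_or_read_only(method, is_authenticated=True)
    print(f"{method:6} anonymous={anonymous} logged_in={logged_in}")
```

      Only the GET row shows anonymous=True; every row shows logged_in=True, which matches the lists under "What Happens Now".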

    5. Updating our serializer Now that snippets are associated with the user that created them, let's update our SnippetSerializer to reflect that. Add the following field to the serializer definition in serializers.py: owner = serializers.ReadOnlyField(source='owner.username') Note: Make sure you also add 'owner', to the list of fields in the inner Meta class. This field is doing something quite interesting. The source argument controls which attribute is used to populate a field, and can point at any attribute on the serialized instance. It can also take the dotted notation shown above, in which case it will traverse the given attributes, in a similar way as it is used with Django's template language. The field we've added is the untyped ReadOnlyField class, in contrast to the other typed fields, such as CharField, BooleanField etc... The untyped ReadOnlyField is always read-only, and will be used for serialized representations, but will not be used for updating model instances when they are deserialized. We could have also used CharField(read_only=True) here.

      Let's break down how to update the SnippetSerializer to include the owner field and explain what this change does in simple terms.

      Step-by-Step Explanation

      1. Add the owner Field: In your SnippetSerializer, add a new field called owner that will show the username of the user who created the snippet.

      2. Update the Meta Class: Make sure to include the owner field in the list of fields in the serializer's Meta class.

      Code Example

      Here's how you can update your SnippetSerializer:

      ```python
      from rest_framework import serializers

      from .models import Snippet


      class SnippetSerializer(serializers.ModelSerializer):
          owner = serializers.ReadOnlyField(source='owner.username')

          class Meta:
              model = Snippet
              fields = ['id', 'title', 'code', 'linenos', 'language', 'style', 'owner']
      ```

      Explanation of the Code

      • owner Field:
          • serializers.ReadOnlyField: This type of field is read-only, meaning it is only used when the data is being sent out, not when data is being received.
          • source='owner.username': The source argument specifies which attribute to use to fill this field. owner.username means it will use the username attribute of the owner (the user who created the snippet).

      What This Field Does

      • Read-Only: The owner field is read-only, so it will show up when you serialize a snippet, but it won't be used when you create or update a snippet.
      • Source Attribute: The source attribute lets you specify which attribute of the model to use. In this case, owner.username will use the username of the user who owns the snippet.

      How It Works

      When you serialize a Snippet instance, the owner field will include the username of the user who created it.

      Example

      Let's say we have a user named Alice who has created a snippet. When we serialize that snippet, it will look like this:

      ```python
      # Assuming snippet is an instance of Snippet created by user 'Alice'
      serializer = SnippetSerializer(snippet)
      print(serializer.data)
      ```

      Output

      {
          "id": 1,
          "title": "Example Snippet",
          "code": "print('Hello, world!')",
          "linenos": true,
          "language": "python",
          "style": "friendly",
          "owner": "Alice"
      }

      Summary

      • Purpose: Adding the owner field to the SnippetSerializer allows us to show the username of the user who created each snippet.
      • Read-Only: The owner field is read-only, meaning it's used for displaying data but not for creating or updating snippets.
      • Source Attribute: The source='owner.username' part ensures that the field will display the username of the owner.

      This update makes your API responses more informative by including the creator's username with each snippet, providing more context and making it easier to understand who created each snippet.
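      The dotted source notation can be pictured as repeated attribute lookups. The following is a minimal sketch in plain Python, not REST framework's own lookup code; the helper resolve_source and the SimpleNamespace stand-ins for a user and a snippet are assumptions made for illustration.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_source(instance, source):
    """Follow a dotted path such as 'owner.username' one attribute at a time."""
    return reduce(getattr, source.split("."), instance)


# Stand-ins for a User and a Snippet model instance (hypothetical data).
alice = SimpleNamespace(username="Alice")
snippet = SimpleNamespace(title="Example Snippet", owner=alice)

print(resolve_source(snippet, "owner.username"))  # prints "Alice"
print(resolve_source(snippet, "title"))           # prints "Example Snippet"
```

      A path without a dot is a single lookup on the instance, so the same helper covers both plain and dotted sources.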

    6. Associating Snippets with Users Right now, if we created a code snippet, there'd be no way of associating the user that created the snippet, with the snippet instance. The user isn't sent as part of the serialized representation, but is instead a property of the incoming request. The way we deal with that is by overriding a .perform_create() method on our snippet views, that allows us to modify how the instance save is managed, and handle any information that is implicit in the incoming request or requested URL. On the SnippetList view class, add the following method: def perform_create(self, serializer): serializer.save(owner=self.request.user) The create() method of our serializer will now be passed an additional 'owner' field, along with the validated data from the request.

      Let's break down the explanation into simpler terms with an example.

      Problem

      When a user creates a new code snippet in our application, we want to associate that snippet with the user who created it. However, the user information isn't directly included in the data sent to the server. Instead, it's part of the request that the server receives.

      Solution

      To handle this, we can override a method called perform_create() in our view. This method lets us customize how a new snippet is saved and allows us to add extra information, like the user who created it.

      Step-by-Step Explanation

      1. Define the Method: We add a method called perform_create() to our view class. This method takes care of saving the snippet with the user information.

      2. Add User Information: Inside the perform_create() method, we use the save() method on the serializer. We pass the current user (who is making the request) as the owner of the snippet.

      Code Example

      Let's see how this looks in code. First, we have a SnippetList view where users can create new snippets.

      ```python
      from rest_framework import generics
      from rest_framework.permissions import IsAuthenticated

      from .models import Snippet
      from .serializers import SnippetSerializer


      class SnippetList(generics.ListCreateAPIView):
          queryset = Snippet.objects.all()
          serializer_class = SnippetSerializer
          permission_classes = [IsAuthenticated]

          def perform_create(self, serializer):
              serializer.save(owner=self.request.user)
      ```

      Explanation of the Code

      • SnippetList View: This view handles listing and creating snippets.
      • perform_create() Method: This method is called when a new snippet is being created.
      • self.request.user: This represents the user who made the request.
      • serializer.save(owner=self.request.user): This saves the new snippet and sets the owner field to the current user.

      What Happens When a Snippet is Created

      1. User Makes a Request: A user (let's say Alice) sends a request to create a new snippet.
      2. Request Processed: The request is received by the SnippetList view.
      3. perform_create() Called: The perform_create() method is called.
      4. User Information Added: The snippet is saved with the owner field set to Alice.
      5. Snippet Created: The new snippet is now associated with Alice as its owner.

      Summary

      • Problem: We need to associate a snippet with the user who created it, but the user info isn't directly included in the data sent to the server.
      • Solution: Override the perform_create() method in the view to add the user information before saving the snippet.
      • Result: The new snippet is saved with the current user as its owner, making it easy to track which user created which snippet.

      This approach ensures that each snippet is correctly associated with the user who created it, even though the user information isn't explicitly part of the data sent by the client.
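      How the extra owner argument reaches the serializer's create step can be modeled without Django. This is a toy sketch under stated assumptions: SketchSerializer and the standalone perform_create function are hypothetical stand-ins, and the only behavior modeled is that keyword arguments passed to save() are merged on top of the validated request data.

```python
class SketchSerializer:
    """Toy stand-in for a serializer; it only models how save() merges data."""

    def __init__(self, validated_data):
        self.validated_data = validated_data

    def create(self, validated_data):
        # A real ModelSerializer would create a model instance here.
        return dict(validated_data)

    def save(self, **kwargs):
        # Extra keyword arguments are merged on top of the validated data.
        return self.create({**self.validated_data, **kwargs})


def perform_create(serializer, request_user):
    """Mirror of the view hook: attach the requesting user as the owner."""
    return serializer.save(owner=request_user)


serializer = SketchSerializer({"title": "Demo", "code": "print(123)"})
snippet = perform_create(serializer, request_user="alice")
print(snippet)
```

      The printed dictionary contains the two fields sent by the client plus an owner key, even though the client never supplied an owner.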

    1. Findings The similarity between samples can be determined using indexed k-mer sequence variants. To increase statistical power, we use coverage information on variant sites, calculating similarity using a likelihood ratio-based test. Per sample error rate, and coverage bias (i.e. missing sites) can also be estimated with this information, which can be used to determine if a spatially indexed PCA-based pre-screening method can be used, which can greatly speed up analysis by preventing exhaustive all-to-all comparisons.

      Reviewer 2: Qian Zhou

      In this paper, the authors have presented a tool, ntsm, which utilizes the k-mer distribution information directly from raw sequencing data for sample swap detection. The approach of bypassing the reference genome alignment step and saving computational resources is commendable. Utilizing k-mers for reference-free and de novo analysis of sequencing data is a valuable application. The authors have demonstrated the impressive performance of ntsm on low coverage data through experimental results presented in the manuscript, showcasing its strengths in terms of sensitivity and accuracy. However, while ntsm eliminates the need for reference genome alignment, it still relies on a pre-defined set of variant sites and pre-built PCA rotation matrices. This raises doubts about the true reference-free nature of ntsm and raises concerns about its generalizability to other species.

      Major comments:

      1. The concept of reference-free: I believe that ntsm's approach is not truly reference-free. In order to use ntsm, it requires the use of existing high-quality population SNP sites and k-mers from the human reference genome. Additionally, the population PCA results are used to assist in pairwise comparisons between samples. Both kinds of information can only be obtained when a reference genome is available. A true reference-free tool would be applicable to species without a reference genome, such as SPLASH (Chaung et al., 2023, Cell). ntsm can be considered an alignment-free or k-mer-based tool.

      2. The reduction of computational costs: ntsm differs from Somalier in its computational workflow. To compare the computational costs or time, a holistic end-to-end comparison is necessary, rather than timing individual steps such as k-mer counting and sample pairwise comparison separately. Conducting an end-to-end comparison for an analysis task allows users to have a comprehensive understanding of the tool's time and cost consumption. Furthermore, when comparing software, it is important to allocate computational resources fairly. For example, ntsm utilizes 16 threads in the 'Sample comparison process' stage, while for the 'k-mer counting (ntsm) vs. alignment (somalier)' stage, tools like bwa and minimap2, which can utilize multiple threads, were run using a single thread.

      3. Sensitivity and Specificity: More experimental details are needed. In the section 'Sensitivity and Specificity of Sample Swaps,' were the results obtained using the 39 HPRC samples? Did it include their Hi-C data? For Fig 6, did the results come from all sequencing datasets of the 39 samples, including Illumina and ONT? Since the results were obtained using full coverage, would the threshold change at lower coverage? For Fig 7, which demonstrates ntsm's results, was PCA information used as an auxiliary? Does the use of PCA information impact Sensitivity and Specificity?

      4. Regarding PCA-based method: The 39 HPRC samples used in the study are actually part of the 3,202 samples from the 1000 Genomes Project. Therefore, it is important to clarify whether the PCA matrix used in the study already includes information from these 39 samples. From a rigorous experimental design perspective, a precomputed PCA matrix should not include information from the 39 samples. Otherwise, the effect of the PCA matrix on these 39 samples may be overestimated. It raises questions about whether the same results can be achieved on non-1000 Genomes Project samples.

      5. The applicability of the tool: In order to expand the applicability of ntsm to a wider range of species, two aspects need to be addressed:
          1) Provide detailed information on customizing the sites file. From the site files available in the ntsm code repository on GitHub, the process of selecting variant sites seems to be more complex than what is described in the manuscript, involving more than just SNP variants.
          2) The sites and PCA files should be user-customizable inputs instead of being built-in. This limitation restricts the application of ntsm to other species.

      Minor comments:

      The manuscript appears to have been hastily written and requires further polish by the authors.

      1. In Figure 6, A and B seem to be labeled incorrectly.

      2. In Figure 9, the two subplots have different y-axes, one labeled "min" and the other labeled "s." Could you clarify what each subplot is illustrating?

      3. When mentioning HPRC for the first time, it would be helpful to provide the full name and explanation of the acronym. However, the full explanation appears in the next paragraph.

      4. "We then keep only purine to pyrimidine (A or T to G or C) variants, as final insurance against possible human error influencing this tool" It seems there may be a mistake or confusion in the sentence. The writer should indeed mention "A/G <-> C/T" instead of "A/T <-> G/C" to accurately describe purine to pyrimidine variants. The writer may have made an error in describing the nucleotide exchange, or it could be a typographical mistake.

      5. There is a typo in the formula for estimating sequencing error rate. (nm)·log(1-… …

    1. DistributionPure6051 [S] (comment score below threshold, -13 points, 2 days ago, 8 children): Managed to grab it for $40. Don't know the model but I'm hoping to clean it up, replace the ribbon, and resell on eBay for a few bucks

       chrisaldrich [flair: My typewriter addiction is almost as bad as my card index one] (8 points, 2 days ago, 1 child): If that's your intention, you'd have been much better off getting it for $5-10 to get some margin for your work.

       DistributionPure6051 [S] (comment score below threshold, -4 points, 2 days ago, 0 children): They wouldn't go lower than $40

       Smubee (3 points, 2 days ago, 5 children): Don't do this.

       DistributionPure6051 [S] (comment score below threshold, -5 points, 1 day ago, 4 children): Explain

       Neilgi (4 points, 1 day ago*, 3 children): Resellers kind of suck the life out of certain industries and make it difficult for hobbyists to get decent equipment. So long as you sell it for what it is WORTH and not upsell by 100%, then you perhaps aren't one of the bad guys.

       DistributionPure6051 [S] (-3 points, 1 day ago, 2 children): I'll take a look at the model and see if I can find its actual worth considering its wear just to try and make a profit. If it ends up actually being $50, oh well, maybe I could send it to a theater or props department in the area

       Smubee (2 points, 1 day ago, 1 child): This makes you an asshole. Don't buy shit just to make a profit. You're inflating a market unnecessarily.

       DistributionPure6051 [S] (-1 points, 1 day ago, 0 children): Then I wasted $50 on a theater prop

      Typically in this sub, when people ask, "Is it worth it?" the presumption is that they're buying it to use for themselves. You left out your context of buying it to sell until later. This means that once you've cleaned things up, and go to try to sell it for something above $40, people are going to show up here and ask that same question. When they do, the answer is going to be that it's far too expensive, especially with shipping, which is notoriously tricky, expensive, and risky.

      You'll be sitting there with a typewriter that you don't care enough about to have known anything about it or whether it had any particular value. This also probably means that you don't know enough about what goes into cleaning and properly adjusting a typewriter either. If someone is enough of a sucker to pay the crazy markup, it means that someone who wants to try out a typewriter will be buying a sub-par machine and having a sub-par experience.

      Unposted reply to u/DistributionPure6051 at https://www.reddit.com/r/typewriters/comments/1dqr02l/is_this_worth_it/ (most of the context is hidden because of downvoting)

  2. Jun 2024
    1. It is interesting that, The Social Work Dictionary definition in the most current version (Barker, 2013) is the most comprehensive of definitions found in the extant literature; it states that social justice is an ideal condition in which all members of a society have the same basic rights, protection, opportunities, obligations, and social benefits. Implicit in this concept is the notion that historical inequalities should be acknowledged and remedied through specific measures. A key social work value, social justice entails advocacy to confront discrimination, oppression, and institutional inequities. (pp. 398–399). This definition is the most closely aligned with the Code of Ethics (NASW, 2021) in that it explicitly recognizes that social justice includes advocacy to address the inequalities that are identified in the guiding documents of the discipline.

      I find this part of the text the most impactful, because the article reports that even though this definition is the most aligned with the NASW Code of Ethics, only one of the 102 articles reviewed in this study used this most recent version. 11% of the reviewed articles quoted The Social Work Dictionary, yet of the nine printed after this version was written, only one used it. This could indicate differing views regarding social justice or a lack of understanding of the NASW Code of Ethics. Regardless of the reasons or variables, it signifies to me that this is a systemic issue that could be negatively impacting the social work field.

    2. Due to the definitional inconsistencies and the lack of agreement within the profession about the centrality of social justice, many educational practices, attitudes, and actions of those working within the profession may not align with socially just ideals that are included in the Code of Ethics and the EPAS (Longres & Scanlon, 2001; Reisch, 2010; Specht & Courtney, 1995). As academics debate the professionalism of social work, its commitment to its values and ethics, and the role of social justice, social work educators continue to educate students who may neither understand nor connect social justice to their social work practice, despite the guidance provided via the Code of Ethics and the EPAS (Finn, 2016; NASW, 2021; Longres & Scanlon, 2001).

      Reading this brings me clarity as I reflect on my past educational experience, my work history, and my struggle to understand my role as a social worker. I studied social work at a state university over a decade ago. My memory may not be a reliable source, yet I do not remember social justice being a term we integrated into the courses, assignments, and discussions. Aspects of social justice were reviewed and explained, yet I understood it as a theory for making sense of the complexities of a client's situation and advocating for them. After graduating, I worked for organizations that were very clear about making social change in their communities to end oppression. That helped me see how social justice can be integrated into my profession with intention. It also showed me the struggles of discussing social justice within an institution. Some people felt that these discussions were “political” and should not be had. They were referring to how social justice can carry the connotation of being a liberal political discussion. It makes me wonder whether the inconsistencies in the definition are part of the problem. I worked for different agencies that further perpetuated the view of social justice and practice as “political”. This led me to openly question some of the organizations’ commonly accepted practices, to feel hesitant in my role, and eventually to feel burned out in these positions. Reading this article makes me wish I had quoted the NASW Code of Ethics in these organizations, so that I felt I had a valid foundation for my perspective, discussions, and concerns.

    1. we need to be able to deploy the app to the cluster

      deploy it with Tilt

      (now, I'm unsure whether Tilt is used here only because it automatically rebuilds the code, or because it is actually useful to debug it. If the code is already deployed, can I ignore Tilt?)

    2. The main goal is to ease the developer experience by helping with local continuous development and deployment of apps to local Kubernetes clusters. It does this by monitoring the source code and automatically building and pushing the deployments.

      Tilt rebuilds and pushes the deployments at each code change

    1. Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.
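      The "explore-and-expand" idea described above can be illustrated with a short sketch. It is not the authors' implementation: the function name, the dataset identifiers, and the file names are hypothetical, and the extension set is only an illustrative subset of Gromacs file types standing in for the authors' full list.

```python
from pathlib import PurePosixPath

# Gromacs-related extensions used as markers in this sketch (illustrative subset).
MD_EXTENSIONS = {".gro", ".xtc", ".trr", ".tpr", ".mdp", ".top", ".itp"}


def explore_and_expand(datasets):
    """Return every file of each dataset that contains at least one MD file."""
    index = {}
    for dataset_id, files in datasets.items():
        # Explore: look for a file whose extension marks it as MD data.
        has_md = any(PurePosixPath(f).suffix.lower() in MD_EXTENSIONS for f in files)
        if has_md:
            # Expand: index all associated content, not only the marker files.
            index[dataset_id] = sorted(files)
    return index


# Hypothetical repository listing; the identifiers and file names are made up.
datasets = {
    "dataset-a": ["traj.xtc", "topol.tpr", "README.md"],
    "dataset-b": ["figure.png", "notes.txt"],
}
print(explore_and_expand(datasets))
```

      Here dataset-a is indexed in full, including its README, because it holds marker files, while dataset-b is skipped regardless of what its metadata might say.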

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      - The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      - The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      - Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      - The authors make a prototype search engine available.

      - The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      - So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was prevalent under all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      - Materials sciences seem to be severely under-represented - commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as the data in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB, DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study presents a valuable tool for searching molecular dynamics simulation data, making such data sets accessible for open science. The authors provide convincing evidence that it is possible to identify useful molecular dynamics simulation data sets and their analysis can produce valuable information.

      Public Reviews

      Reviewer #1 (Public Review):

      Summary:

      Tiemann et al. have undertaken an original study on the availability of molecular dynamics (MD) simulation datasets across the Internet. There is a widespread belief that extensive, well-curated MD datasets would enable the development of novel classes of AI models for structural biology. However, currently, there is no standard for sharing MD datasets. As generating MD datasets is energy-intensive, it is also important to facilitate the reuse of MD datasets to minimize energy consumption. Developing a universally accepted standard for depositing and curating MD datasets is a huge undertaking. The study by Tiemann et al. will be very valuable in informing policy developments toward this goal.

      Strengths:

      The study presents an original approach to addressing a growing concern in the field. It is clear that adopting a more collaborative approach could significantly enhance the impact of MD simulations in modern molecular sciences.

      The timing of the work is appropriate, given the current interest in developing AI models for describing biomolecular dynamics.

      Weaknesses:

      The study primarily focuses on one major MD engine (GROMACS), although this limitation is not significant considering the proof-of-concept nature of the study.

      We thank the reviewer for his/her comments. Moving forward, our plan includes expanding this research to encompass other MD engines used in biomolecular simulations and materials sciences, such as NAMD, Charmm, Amber, LAMMPS, etc. However, this requires parsing associated files to supplement the sparse metadata generally available for the related datasets.

      Reviewer #2 (Public Review):

      Summary:

      Molecular dynamics (MD) data is deposited in public, non-specialist repositories. This work starts from the premise that these data are a valuable resource as they could be used by other researchers to extract additional insights from these simulations; it could also potentially be used as training data for ML/AI approaches. The problem is that mining these data is difficult because they are not easy to find and work with. The primary goal of the authors was to discover and index these difficult-to-find MD datasets, which they call the "dark matter of the MD universe" (in contrast to data sets held in specialist databases).

      The authors developed a search strategy that avoided the use of ill-defined metadata but instead relied on the knowledge of the restricted set of file formats used in MD simulations as a true marker for the data they were looking for. Detection of MD data marked a data set as relevant with a follow-up indexing strategy of all associated content. This "explore-and-expand" strategy allowed the authors for the first time to provide a realistic census of the MD data in non-specialist repositories.
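      The "explore-and-expand" idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the dataset records, the function names, and the exact list of marker extensions are assumptions chosen for illustration (the extensions listed are common GROMACS file types).

```python
import os

# Assumed set of marker extensions; these are common GROMACS file types.
MD_EXTENSIONS = {".xtc", ".trr", ".gro", ".mdp", ".tpr", ".top"}


def has_md_file(filenames):
    """Explore step: does the dataset contain at least one MD-specific file?"""
    return any(os.path.splitext(name)[1].lower() in MD_EXTENSIONS
               for name in filenames)


def explore_and_expand(datasets):
    """Keep datasets flagged by an MD file, then index all of their files.

    `datasets` maps a (hypothetical) dataset identifier to its file names.
    """
    index = {}
    for dataset_id, filenames in datasets.items():
        if has_md_file(filenames):
            # Expand step: index every associated file, MD-specific or not.
            index[dataset_id] = sorted(filenames)
    return index


# Hypothetical repository content used only to exercise the sketch.
datasets = {
    "dataset-A": ["traj.xtc", "md.mdp", "README.txt"],
    "dataset-B": ["figure1.png", "notes.docx"],
}
index = explore_and_expand(datasets)
```

      In this sketch, a dataset is retained because of what it contains rather than how it is described, which is why the approach is expected to yield few false positives.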

      As a proof of principle, they analyzed a subset of the data (primarily related to simulations with the popular Gromacs MD package) to summarize the types of simulated systems (primarily biomolecular systems) and commonly used simulation settings.

      Based on their experience they propose best practices for metadata provision to make MD data FAIR (findable, accessible, interoperable, reusable).

      A prototype search engine that works on the indexed datasets is made publicly available. All data and code are made freely available as open source/open data.

      Strengths:

      The novel search strategy is based on relevant data to identify full datasets instead of relying on metadata and thus is likely to have many true positives and few false positives.

      The paper provides a first glimpse at the potential hidden treasures of MD simulations and force field parametrizations of molecules.

      Analysis of parameter settings of MD simulations from how researchers *actually* run simulations can provide valuable feedback to MD code developers for how to document/educate users. This approach is much better than analyzing what authors write in the Methods sections.

      The authors make a prototype search engine available.

      The guidelines for FAIR MD data are based on experience gained from trying to make sense of the data.

      Weaknesses:

      So far the work is a proof-of-concept that focuses on MD data produced by Gromacs (which was prevalent under all indexed and identified packages).

      As discussed in the manuscript, some types of biomolecules are likely underrepresented because different communities have different preferences for force fields/MD codes (for example: carbohydrates with AMBER/GLYCAM using AMBER MD instead of Gromacs).

      Materials sciences seem to be severely under-represented --- commonly used codes in this area such as LAMMPS are not even detected, and only very few examples could be identified. As it is, the paper primarily provides an insight into the *biomolecular* MD simulation world.

      The authors succeed in providing a first realistic view on what MD data is available in public repositories. In particular, their explore-expand approach has the potential to be customized for all kinds of specialist simulation data, whereby specific artifacts are used as fiducial markers instead of metadata. The more detailed analysis is limited to Gromacs simulations and primarily biomolecular simulations (even though MD is also widely used in other fields such as the materials sciences). This restricted view may simply be correlated with the user community of Gromacs and hopefully, follow-up studies from this work will shed more light on this shortcoming.

      The study quantified the number of trajectories currently held in structured databases as ~10k vs ~30k in generalist repositories. To go beyond the proof-of-principle analysis it would be interesting to analyze the data in specialist repositories in the same way as the data in the generalist ones, especially as there are now efforts underway to create a database for MD simulations (Grant 'Molecular dynamics simulation for biology and chemistry research' to establish MDDB, DOI 10.3030/101094651). One should note that structured databases do not invalidate the approach pioneered in this work; if anything they are orthogonal to each other and both will likely play an important role in growing the usefulness of MD simulations in the future.

      We thank the reviewer for his/her comments. As mentioned to Reviewer 1, we intend to extend this work to other MD engines in the near future to go beyond Gromacs and even biomolecular simulations. Furthermore, as the value of accessing and indexing specialized MD databases such as MDDB, MemprotMD, GPCRmd, NMRLipids, ATLAS, and others has been mentioned by the reviewer, it is indeed one of our next steps to continue to expand the MDverse catalog of MD data. This indexing may also extend the visibility and widespread adoption of these specific databases.

      Reviewer #3 (Public Review):

      Molecular dynamics (MD) simulations nowadays are an essential element of structural biology investigations, complementing experiments and aiding their interpretation by revealing transient processes or details (such as the effects of glycosylation on the SARS-CoV-2 spike protein, for example (Casalino et al. ACS Cent. Sci. 2020; 6, 10, 1722-1734 https://doi.org/10.1021/acscentsci.0c01056)) that cannot be observed directly. MD simulations can allow for the calculation of thermodynamic, kinetic, and other properties and the prediction of biological or chemical activity. MD simulations can now serve as "computational assays" (Huggins et al. WIREs Comput Mol Sci. 2019; 9:e1393. https://doi.org/10.1002/wcms.1393). Conceptually, MD simulations have played a crucial role in developing the understanding that the dynamics and conformational behaviour of biological macromolecules are essential to their function, and are shaped by evolution. Atomistic simulations range up to the billion atom scale with exascale resources (e.g. simulations of SARS-CoV-2 in a respiratory aerosol. Dommer et al. The International Journal of High Performance Computing Applications. 2023; 37:28-44. doi:10.1177/10943420221128233), while coarse-grained models allow simulations on even larger length- and timescales. Simulations with combined quantum mechanics/molecular mechanics (QM/MM) methods can investigate biochemical reactivity, and overcome limitations of empirical forcefields (Cui et al. J. Phys. Chem. B 2021; 125, 689 https://doi.org/10.1021/acs.jpcb.0c09898).

      MD simulations generate large amounts of data (e.g. structures along the MD trajectory) and increasingly, e.g. because of funder mandates for open science, these data are deposited in publicly accessible repositories. There is real potential to learn from these data en masse, not only to understand biomolecular dynamics but also to explore methodological issues. Deposition of data is haphazard and lags far behind experimental structural biology, however, and it is also hard to answer the apparently simple question of "what is out there?". This is the question that Tiemann et al explore in this nice and important work, focusing on simulations run with the widely used GROMACS package. They develop a search strategy and identify almost 2,000 datasets from Zenodo, Figshare and Open Science Framework. This provides a very useful resource. For these datasets, they analyse features of the simulations (e.g. atomistic or coarse-grained), which provides a useful snapshot of current simulation approaches. The analysis is presented clearly and discussed insightfully. They also present a search engine to explore MD data, the MDverse data explorer, which promises to be a very useful tool.

      As the authors state: "Eventually, front-end solutions such as the MDverse data explorer tool can evolve being more user-friendly by interfacing the structures and dynamics with interactive 3D molecular viewers". This will make MD simulations accessible to non-specialists and researchers in other areas. I would envisage that this will also include approaches using interactive virtual reality for an immersive exploration of structure and dynamics, and virtual collaboration (e.g. O'Connor et al., Sci. Adv.4, eaat2731 (2018). DOI:10.1126/sciadv.aat2731)

      The need to share data effectively, and to compare simulations and test models, was illustrated clearly in the COVID-19 pandemic, which also demonstrated a willingness and commitment to data sharing across the international community (e.g. Amaro and Mulholland, J. Chem. Inf. Model. 2020, 60, 6, 2653-2656 https://doi.org/10.1021/acs.jcim.0c00319; Computing in Science & Engineering 2020, 22, 30-36 doi: 10.1109/MCSE.2020.3024155). There are important lessons to learn here, for simulations to be reproducible and reliable, for rapid testing, for exploiting data with machine learning, and for linking to data from other approaches. Tiemann et al. discuss how to develop these links, providing good perspectives and suggestions.

      I agree completely with the statement of the authors that "Even if MD data represents only 1 % of the total volume of data stored in Zenodo, we believe it is our responsibility, as a community, to develop a better sharing and reuse of MD simulation files - and it will neither have to be particularly cumbersome nor expensive. To this end, we are proposing two solutions. First, improve practices for sharing and depositing MD data in data repositories. Second, improve the FAIRness of already available MD data notably by improving the quality of the current metadata."

      This nicely states the challenge to the biomolecular simulation community. There is a clear need for standards for MD data and associated metadata. This will also help with the development of standards of best practice in simulations. The authors provide useful and detailed recommendations for MD metadata. These recommendations should contribute to discussions on the development of standards by researchers, funders, and publishers. Community organizations (such as CCP-BioSim and HECBioSim in the UK, BioExcel, CECAM, MolSSI, learned societies etc) have an important part to play in these developments, which are vital for the future of biomolecular simulation.

      We thank the reviewer for his/her comments. Beyond the points mentioned to Reviewers 1 and 2, as the reviewer suggested, it would be of great interest to combine innovative and immersive approaches to visualize and possibly interact with the data collected. This is indeed more and more amenable thanks to technologies such as WebGL and programs such as Mol*, or even - as also pointed out by the reviewer - through virtual reality, for example with the mentioned Narupa framework or with the UnityMol software. For a comprehensive review on MD trajectory visualization and associated challenges, we refer to our recent review article https://doi.org/10.3389/fbinf.2024.1356659.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Some minor text editing would improve the readability of the manuscript.

      It would be very useful if the authors could share their perspectives on the best and most efficient approach to sharing datasets and code associated with a publication. My concern lies in the fact that Github, which is currently the dominant platform for sharing code, is not well-suited for hosting large MD datasets. As a result, researchers often need to adopt a workflow where code is shared on Github and datasets are stored elsewhere (e.g., Zenodo). While this is feasible, it adds extra work. Ideally, a transparent process could be developed to seamlessly share code and datasets linked to a study through a unified interface.

      We thank the reviewer for this excellent suggestion. To our knowledge, there is not yet an easy framework to jointly store and share code and data linked to their scientific publication. Of course, code can be submitted to “generic” databases along with the data, but at present these do not provide useful features such as collaborative work and history tracking to the extent that GitHub does.

      Although GitHub is indeed a suitable platform to deposit code, we strongly advise researchers to archive their code in Software Heritage. In addition to preserving source code, Software Heritage provides a unique identifier called SWHID that unambiguously makes reference to a specific version of the source code.

      So far, it is the responsibility of the scientific publication authors to link datasets and source codes (whether in GitHub or Software Heritage) in their paper, but also to make the reverse link from the data and code sharing platforms to the paper after publication.

      As mentioned by the reviewer, a unified interface that could ease this process would significantly contribute to FAIR-ness in MD.

      Reviewer #2 (Recommendations For The Authors):

      L180: I am not aware that TRR files contain energy terms as stated here, my understanding was that EDR files primarily served that purpose.

      “…available in one dataset. Interestingly, we found 1,406 .trr files, which contain trajectory but also additional information such as velocities, energy of the system, etc. While the file is especially useful in terms of reusability, the large size (can go up to several 100GB) limits its deposition in most…”

      Indeed, our formulation was ambiguous. The EDR files contain the detailed information on energies, whereas TRR files contain numerous values from the trajectory such as coordinates, velocities, forces and to some extent also energies

      (https://manual.gromacs.org/current/reference-manual/file-formats.html#trr)

      L207: The text states that the total time was not available from XTC files, only the number of frames. However, XTC files record time stamps in addition to frame numbers. As long as these times are in the Gromacs standard of picoseconds, the simulation time ought to be available from XTCs.

      “…systems and the number of frames available in the files (Fig. 3-B). Of note, the frames do not directly translate to the simulation runtime - more information deposited in other files (e.g. .mdp files) is needed to determine the complete runtime of the simulation. The system was up…”.

      Thank you for the useful comment, we removed this sentence. We now mention that studying the simulation time would be of interest in the future, especially when we will perform an exhaustive analysis of XTC files.

      “Of note, as .xtc files also contain time stamps, it would be interesting to study the relationship between the time and the number of frames to get useful information about the sampling. Nevertheless, this analysis would be possible only for unbiased MD simulations. So, we would need to decipher if the .xtc file is coming from biased or unbiased simulations, which may not be trivial.”
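      The relationship between the number of frames and the simulated time discussed in this exchange can be made concrete with a minimal sketch. It assumes that an .mdp file is available and sets the GROMACS options `dt` (time step, in ps), `nsteps`, and `nstxout-compressed` (the output interval for .xtc frames); the parser and function names are hypothetical and handle only simple `key = value` lines.

```python
def parse_mdp(text):
    """Parse simple 'key = value' lines of an .mdp file into a dict.

    Text after ';' is treated as a comment. Keys are lower-cased and
    underscores are mapped to dashes, since GROMACS accepts both spellings.
    """
    params = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            params[key.strip().lower().replace("_", "-")] = value.strip()
    return params


def simulated_time_ps(params):
    """Total simulated time in ps: time step multiplied by number of steps."""
    return float(params["dt"]) * int(params["nsteps"])


def frame_spacing_ps(params):
    """Time between consecutive .xtc frames in ps."""
    return float(params["dt"]) * int(params["nstxout-compressed"])


# Hypothetical .mdp content used only to exercise the sketch.
example_mdp = """
; run control
integrator          = md
dt                  = 0.002     ; ps
nsteps              = 500000
nstxout_compressed  = 5000
"""
params = parse_mdp(example_mdp)
total_ps = simulated_time_ps(params)   # about 1000 ps for these settings
spacing_ps = frame_spacing_ps(params)  # about 10 ps between frames
```

      With such values, the frame count alone says little about the runtime unless the frame spacing is also known, which is the information that either the .mdp settings or the time stamps stored in the .xtc file can provide.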

      Analysis of MDP files: Were these standard equilibrium MD or can you distinguish biased MD or free energy calculations?

      Currently we do not distinguish between biased and unbiased MD, but in the future we may attempt to do so, e.g. by correlating it with standard equilibration force-fields/parameters, timesteps or similar. Nevertheless, a true distinction will remain challenging.

      L336: typo: pikes -> spikes (or peaks?)

      “…simulations of Lennard-Jones models (Jeon et al., 2016). Interestingly, we noticed the appearance of several pikes at 400K, 600K and 800K, which were not present before the end of the year 2022. These peaks correspond to the same study related to the stability of hydrated crystals (Dybeck et al., 2023). Overall, this analysis revealed that a wide range of temperatures have been explored,…”

      Thank you. We have corrected this typo.

      Make clear how multiple versions of data sets are handled, e.g., if v1, v2, and v3 of a dataset are provided in Zenodo then which one is counted or are all counted?

      We collected the latest version only of datasets, as exposed by default by the Zenodo API. To reflect this, we added the following sentence to the Methods and Materials section, Initial data collection sub-section:

      “By default, the last version of the datasets was collected.”

      L248 Analysis of GRO files seems fairly narrow because PDB files are very often used for exactly the same purpose, even in the context of Gromacs simulations, not the least because it is familiar to structural biologists that may be interested in representative MD snapshots. Despite all the shortcomings of abusing the PDB format for MD, it is an attempt at increased interoperability. Perhaps the authors can make sure that readers understand that choosing GRO for analysis may give a somewhat skewed picture, even within Gromacs simulations.

      Thanks for this comment. We collected about 12,000 PDB files that could indeed be output from Gromacs simulations and easily be shared due to the universality of this format, but that could as well come from different sources (like other MD packages or the PDB database itself). We purposely decided to limit our study to files strictly associated with the Gromacs package, like MDP and XTC file types. However, we will extend our survey to all other structure-like formats and especially the PDB file type. We reflected this purpose in the following sentence (after line 281)

      “Beyond .gro files, we would like to analyze the ensemble of the ~12,000 .pdb files extracted in this study (see Figure 2-B) to better characterize the types of molecular structures deposited.”

      A simple template metadata file would be welcome (e.g., served from a GitHub/GitLab repository so that it can be improved with community input).

      Thank you for this suggestion, with which we fundamentally agree. However, the generation of such a file is a major task, and we believe that the creation of a metadata file template requires far-reaching considerations; it is therefore beyond the scope of this paper and should not be decided by a small group of researchers. Indeed, this topic requires a large consensus of different stakeholders, from users, to MD program developers, and journal editors. It would be especially useful to organize dedicated workshops with representatives of all these communities to tackle this specific issue, as mentioned by Reviewer 3 in his/her public review. As a basis for this discussion, we humbly proposed at the end of this manuscript a few non-constraining guidelines based on our experience retrieving the data.

      To emphasize this statement, we added the following sentence at the end of the “Guidelines for better sharing of MD simulation data” section (line 420):

      “Converging on a set of metadata and format requires a large consensus of different stakeholders from users, to MD program developers, and journal editors. It would be especially useful to organize specific workshops with representatives of all these communities to collectively tackle this specific issue.”

      In "Data and code availability" it would be good to specify licenses in addition to stating "open source". Thank you for pointing out that GitLab/GitHub are not archives and that everyone should be strongly encouraged to submit data to stable archival repositories.

      We added the corresponding licenses for code and data in the “Data and code availability” section.

      Reviewer #3 (Recommendations For The Authors)

      The paper is well written, with very few typographical or other minor errors.

      Minor points:

      Line 468-9 "can evolve being more user-friendly" should be "can evolve to being more user-friendly", I think.

      Thank you, we have changed the wording accordingly.

    1. prerequisite

      Replication--getting the same result from doing the whole study again--is logically independent of whether any code/analysis from the original paper even exists. Recommend revising and instead talking about how they are related and often conflated.

    1. Reviewer #3 (Public Review):

      [Editors' note: This review contains many criticisms that apply to the whole sub-field of slow/fast gamma oscillations in the hippocampus, as opposed to this particular paper. In the editors' view, these comments are beyond the scope of any single paper. However, they represent a view that, if true, should contextualise the interpretation of this paper and all papers in the sub-field. In doing so, they highlight an ongoing debate within the broader field.]

      Summary:

      The authors aimed to elucidate the role of dynamic gamma modulation in the development of hippocampal theta sequences, utilizing the traditional framework of "two gammas," a slow and a fast rhythm. This framework is currently being challenged, necessitating further analyses to establish and secure the assumed premises before substantiating the claims made in the present article.

      The results are too preliminary and need to integrate contemporary literature. New analyses are required to address these concerns. However, by addressing these issues, it may be possible to produce an impactful manuscript.

      I. Introduction

      Within the introduction, multiple broad assertions are conveyed that serve as the premise for the research. However, equally important citations that are not mentioned potentially contradict the ideas that serve as the foundation. Instances of these are described below:

      (1) Are there multiple gammas? The authors launched the study on the premise that two different gamma bands are communicated from CA3 and the entorhinal cortex. However, recent literature suggests otherwise, offering that the slow gamma component may be related to theta harmonics:

      From a review by Etter, Carmichael and Williams (2023):

      "Gamma-based coherence has been a prominent model for communication across the hippocampal-entorhinal circuit and has classically focused on slow and fast gamma oscillations originating in CA3 and medial entorhinal cortex, respectively. These two distinct gammas are then hypothesized to be integrated into hippocampal CA1 with theta oscillations on a cycle-to-cycle basis (Colgin et al., 2009; Schomburg et al., 2014). This would suggest that theta oscillations in CA1 could serve to partition temporal windows that enable the integration of inputs from these upstream regions using alternating gamma waves (Vinck et al., 2023). However, these models have largely been based on correlations between shifting CA3 and medial entorhinal cortex to CA1 coherence in theta and gamma bands. In vivo, excitatory inputs from the entorhinal cortex to the dentate gyrus are most coherent in the theta band, while gamma oscillations would be generated locally from presumed local inhibitory inputs (Pernía-Andrade and Jonas, 2014). This predominance of theta over gamma coherence has also been reported between hippocampal CA1 and the medial entorhinal cortex (Zhou et al., 2022). Another potential pitfall in the communication-through-coherence hypothesis is that theta oscillations harmonics could overlap with higher frequency bands (Czurkó et al., 1999; Terrazas et al., 2005), including slow gamma (Petersen and Buzsáki, 2020). The asymmetry of theta oscillations (Belluscio et al., 2012) can lead to harmonics that extend into the slow gamma range (Scheffer-Teixeira and Tort, 2016), which may lead to a misattribution as to the origin of slow-gamma coherence and the degree of spike modulation in the gamma range during movement (Zhou et al., 2019)."

      And from Benjamin Griffiths and Ole Jensen (2023):

      "That said, in both rodent and human studies, measurements of 'slow' gamma oscillations may be susceptible to distortion by theta harmonics [53], meaning open questions remain about what can be attributed to 'slow' gamma oscillations and what is attributable to theta."

      This second statement should be heavily considered as it is from one of the original authors who reported the existence of slow gamma.

      Yet another instance from Schomburg, Fernández-Ruiz, Mizuseki, Berényi, Anastassiou, Christof Koch, and Buzsáki (2014):

      "Note that modulation from 20-30 Hz may not be related to gamma activity but, instead, reflect timing relationships with non-sinusoidal features of theta waves (Belluscio et al., 2012) and/or the 3rd theta harmonic."

      One of this manuscript's authors is Fernández-Ruiz, a contemporary proponent of the multiple gamma theory. Thus, the modulation to slow gamma offered in the present manuscript may actually be related to theta harmonics.

      With the above emphasis from proponents of the slow/fast gamma theory on disambiguating harmonics from slow gamma, our first suggestion to the authors is that they A) address these statements (citing the work of these authors in their manuscript) and B) demonstrably quantify theta harmonics in relation to slow gamma prior to making assertions of phase relationships (methodological suggestions below). As the frequency of theta harmonics can extend as high as 56 Hz (PMID: 32297752), overlapping with the slow gamma range defined here (25-45 Hz), it will be important to establish an approach that decouples the two phenomena using an approach other than an arbitrary frequency boundary.
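      The overlap at issue can be illustrated with simple arithmetic. The sketch below lists which integer harmonics of a theta fundamental fall inside the slow gamma band as defined in the manuscript (25-45 Hz). The theta frequencies of 6, 8, and 10 Hz are assumptions chosen for illustration, and the function name is hypothetical.

```python
SLOW_GAMMA_BAND_HZ = (25.0, 45.0)  # slow gamma range as defined in the manuscript


def harmonics_in_band(fundamental_hz, band, max_order=10):
    """Return (order, frequency) pairs of harmonics that lie inside `band`.

    Order 1 is the fundamental itself, so harmonics start at order 2.
    """
    low, high = band
    return [(n, n * fundamental_hz)
            for n in range(2, max_order + 1)
            if low <= n * fundamental_hz <= high]


# Assumed theta frequencies, for illustration only.
overlap = {theta: harmonics_in_band(theta, SLOW_GAMMA_BAND_HZ)
           for theta in (6.0, 8.0, 10.0)}
# An 8 Hz theta places its 4th (32 Hz) and 5th (40 Hz) harmonics in the band.
```

      For every fundamental in this assumed range, at least two harmonics land inside 25-45 Hz, which is why a fixed frequency boundary alone cannot separate slow gamma from theta harmonics.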

      (2) Can gammas be segregated into different lamina of the hippocampus? This idea appears to be foundational in the premise of the research but is also undergoing revision.

      As discussed by Etter et al. above, the initial theory of gamma routing was launched on coherence values. However, the values reported by Colgin et al. (2009) lean more towards incoherence (a value of 0) rather than coherence (1), suggesting a weak to negligible interaction. Nevertheless, this theory is coupled with the idea that the different gamma frequencies are exclusive to the specific lamina of the hippocampus.

      Recently, Douchamps et al. (2024) suggested a broader, more nuanced understanding of gamma oscillations than previously thought, emphasizing their wide range and variability across hippocampal layers. This perspective challenges the traditional dichotomy of gamma sub-bands (e.g., slow vs. medium gamma) and their associated cognitive functions based on a more rigid classification according to frequency and phase relative to the theta rhythm. Moreover, they observed all frequencies across all layers.

      Similarly, the current source density plots from Belluscio et al. (2012) suggest that SG and FG can be observed in both the radiatum and lacunosum-moleculare.

      Therefore, if the initial coherence values are weak to negligible and both slow and fast gamma are observed in all layers of the hippocampus, can the different gammas be exclusively related to either anatomical inputs or psychological functions (as done in the present manuscript)? Do these observations challenge the authors' premise of their research? At the least, please discuss.

      (3) Do place cells, phase precession, and theta sequences require input from afferent regions? It is offered in the introduction that "Fast gamma (~65-100Hz), associated with the input from the medial entorhinal cortex, is thought to rapidly encode ongoing novel information in the context (Fernandez-Ruiz et al., 2021; Kemere, Carr, Karlsson, & Frank, 2013; Zheng et al., 2016)".

      However, CA1 place fields remain fairly intact following MEC inactivation. Relevant publications include Ipshita Zutshi, Manuel Valero, Antonio Fernández-Ruiz, and György Buzsáki (2022), "CA1 place cells and assemblies persist despite combined mEC and CA3 silencing", and Hadas E Sloin, Lidor Spivak, Amir Levi, Roni Gattegno, Shirly Someck, Eran Stark (2024): "These findings are incompatible with precession models based on inheritance, dual-input, spreading activation, inhibition-excitation summation, or somato-dendritic competition. Thus, a precession generator resides locally within CA1."

      These publications, at the least, challenge the inheritance model by which the afferent input controls CA1 place field spike timing. The research premise offered by the authors is couched in the logic of inheritance, when the effect that the authors are observing could be governed by local intrinsic activity (e.g., phase precession and gamma are locally generated, and the attribution to routed input is perhaps erroneous). Certainly, it is worth discussing these manuscripts in the context of the present manuscript.

      II. Results

      (1) Figure 2

      a. There is a bit of a puzzle here that should be discussed. If slow and fast frequencies modulate 25% of neurons, how can these rhythms serve as mechanisms of communication/support psychological functions? For instance, if fast gamma is engaged in rapid encoding (line 72) and slow gamma is related to the integration processing of learned information (line 84), and these are functions of the hippocampus, then why do these rhythms modulate so few cells? Is this to say 75% of CA1 neurons do not listen to CA3 or MEC input?

      b. Figure 2. It is hard to know if the mean vector lengths presented are large or small. Moreover, one can expect to find significance due to chance. For instance, it is challenging to find a frequency in which modulation strength is zero (please see Figure 4 of PMID: 30428340 or Figure 7 of PMID: 31324673).

      i. Please construct the histograms of Mean Vector Length as in the above papers, using 1 Hz filter steps from 1-120Hz and include it as part of Figure 2 (i.e., calculate the mean vector length for the filtered LFP in steps of 1-2 Hz, 2-3 Hz, 3-4 Hz,... etc). This should help the authors portray the amount of modulation these neurons have relative to the theta rhythm and other frequencies. If the theta mean vector length is higher, should it be considered the primary modulatory influence of these neurons (with slow and fast gammas as a minor influence)?

      ii. It is possible to infer a neuron's degree of oscillatory modulation without using the LFP. For instance, one can create an ISI histogram as done in Figure 1 here (https://www.biorxiv.org/content/10.1101/2021.09.20.461152v3.full.pdf+html; "Distinct ground state and activated state modes of firing in forebrain neurons"). The reciprocal of the ISI values would be the "instantaneous spike frequency". In favor of the Douchamps et al. (2024) results, the figure of the bioRxiv paper implies that there is a single gamma-frequency modulation, as there is only a single bump in the ISIs in the 10^-1.5 to 10^-2 range. Therefore, to vet the slow gamma results and the premise of two gammas offered in the introduction, it would be worth including this analysis as part of Figure 2.
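
A minimal sketch of the LFP-free check described in point ii, assuming spike times are available in seconds. The function names, the bin width, and the synthetic 40 Hz spike train are illustrative assumptions, not taken from the manuscript or the cited preprint.

```python
import math

def isi_log_histogram(spike_times_s, bin_width=0.1, lo=-3.0, hi=1.0):
    """Histogram of log10(ISI in seconds) between 10^lo and 10^hi."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for earlier, later in zip(spike_times_s, spike_times_s[1:]):
        isi = later - earlier
        if isi <= 0:
            continue  # skip duplicate or unsorted timestamps
        idx = math.floor((math.log10(isi) - lo) / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def gamma_range_fraction(spike_times_s):
    """Fraction of ISIs between 10^-2 and 10^-1.5 s (roughly 32-100 Hz)."""
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:]) if b > a]
    in_range = [isi for isi in isis if 1e-2 <= isi <= 10 ** -1.5]
    return len(in_range) / len(isis) if isis else 0.0

# Hypothetical spike train: one spike every 25 ms, so the reciprocal of each
# ISI (the "instantaneous spike frequency") is 40 Hz.
spikes = [i * 0.025 for i in range(200)]
counts = isi_log_histogram(spikes)
peak_bin = counts.index(max(counts))
print("peak log10(ISI) bin starts at", -3.0 + 0.1 * peak_bin)
print("fraction of ISIs in the gamma range:", gamma_range_fraction(spikes))
```

On real data, a second bump inside the 10^-2 to 10^-1.5 s window would be the feature of interest; a single bump is what the reviewer reads from the preprint figure.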

      c. There are some things generally concerning about Figure 2.

      i. First, the raw trace does not seem to have clear theta epochs (it is challenging to ascertain the start and end of a theta cycle). Certainly, it would be worth highlighting the relationship between theta and the gammas and picking a nice theta epoch.

      ii. Also, in panel A, there looks to be a declining amplitude relationship between the raw, fast, and slow gamma traces, assuming that the scale bars represent 100 µV in all three traces. The raw trace is significantly larger than the fast gamma. However, this relationship does not seem to be the case in panel B (in which both the raw and unfiltered examples of slow and fast gamma appear to be equal; the right panels of B suggest that fast gamma is larger than slow, appearing to contradict the A = 1/f organization of the power spectral density). Please explain why this occurs. Including the power spectral density (see below) should resolve some of this.

      iii. Within the example of spiking to phase in the left side of Panel B (fast gamma example)- the neuron appears to fire near the trough twice, near the peak twice, and somewhere in between once. A similar relationship is observed for the slow gamma epoch. One would conclude from these plots that the interaction of the neuron with the two rhythms is the same. However, the mean vector lengths and histograms below these plots suggest a different story in which the neuron is modulated by FG but not SG. Please reconcile this.

      iv. For calculating the MVL, it seems that the number of spikes that the neuron fires would play a significant role. Working towards our next point, there may be a bias of finding a relationship if there are too few spikes (spurious clustering due to sparse data) and/or higher coupling values for higher firing rate cells (cells with higher firing rates will clearly show a relationship), forming a sort of inverse Yerkes-Dodson curve. Also, without understanding the magnitude of the MVL relative to other frequencies, it may be that these values are indeed larger than zero, but not biologically significant.

      - Please provide a scatter plot of Neuron MVL versus the Neuron's Firing Rate for 1) theta (7-9 Hz), 2) slow gamma, and 3) fast gamma, along with their line of best fit.

      - Please run a shuffle control where the LFP trace is shifted by random values between 125 and 1000 ms and recalculate the MVL for theta, slow, and fast gamma. Often, these shuffle controls are repeated 100-1000 times (see cross-correlation analyses of Fujisawa, Buzsaki et al.).

      - To establish that firing rate does not play a role in uncovering modulation, it would be worth conducting a spike number control, reducing the number of spikes per cell so that they are all equal before calculating the phase plots/MVL.
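
The shuffle and spike-number controls requested above can be sketched as follows. This is an illustration on synthetic data, not the authors' pipeline: the phase series is simulated with a frequency redrawn on every cycle (a perfectly periodic phase would be unaffected by a circular shift), and all function names, the 200 shuffles, and the 100-spike target are assumptions.

```python
import cmath
import math
import random

def mean_vector_length(phases):
    """Length of the mean resultant vector of a list of phases (radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)

def simulate_phase(n_samples, fs, rng):
    """Synthetic oscillation phase; frequency is redrawn (6-10 Hz) each cycle."""
    phase, freq, out = 0.0, rng.uniform(6.0, 10.0), []
    for _ in range(n_samples):
        out.append(phase)
        phase += 2 * math.pi * freq / fs
        if phase >= 2 * math.pi:
            phase -= 2 * math.pi
            freq = rng.uniform(6.0, 10.0)
    return out

def shuffle_p_value(phase, spikes, fs, rng, n_shuffles=200):
    """Circularly shift the phase series by 125-1000 ms and recompute the MVL."""
    n = len(phase)
    observed = mean_vector_length([phase[i] for i in spikes])
    exceed = 0
    for _ in range(n_shuffles):
        shift = rng.randint(int(0.125 * fs), int(1.0 * fs))
        null = mean_vector_length([phase[(i + shift) % n] for i in spikes])
        exceed += null >= observed
    # Add-one correction so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_shuffles + 1)

rng = random.Random(0)
fs = 1000
phase = simulate_phase(60 * fs, fs, rng)
# Hypothetical phase-locked cell: one spike each time the phase crosses pi.
spikes = [i for i in range(1, len(phase)) if phase[i - 1] < math.pi <= phase[i]]
# Spike-number control: subsample every cell to the same spike count.
matched = rng.sample(spikes, 100)
observed, p = shuffle_p_value(phase, matched, fs, rng)
print(f"MVL = {observed:.3f}, shuffle p = {p:.4f}")
```

Running the same function on each band's phase series (theta, slow gamma, fast gamma) with an identical spike count per cell would address both the chance-level and the firing-rate concerns in one pass.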

      (2) Something that I anticipated to see addressed in the manuscript was the study from Grosmark and Buzsaki (2016): "Cell assembly sequences during learning are "replayed" during hippocampal ripples and contribute to the consolidation of episodic memories. However, neuronal sequences may also reflect preexisting dynamics. We report that sequences of place-cell firing in a novel environment are formed from a combination of the contributions of a rigid, predominantly fast-firing subset of pyramidal neurons with low spatial specificity and limited change across sleep-experience-sleep and a slow-firing plastic subset. Slow-firing cells, rather than fast-firing cells, gained high place specificity during exploration, elevated their association with ripples, and showed increased bursting and temporal coactivation during postexperience sleep. Thus, slow- and fast-firing neurons, although forming a continuous distribution, have different coding and plastic properties."

      My concern is that much of the reported results in the present manuscript appear to recapitulate the observations of Grosmark and Buzsaki, but without accounting for differences in firing rate. A parsimonious alternative explanation for what is observed in the present manuscript is that high firing rate neurons, more integrated into the local network and orchestrating local gamma activity (PING), exhibit more coupling to theta and gamma. In this alternative perspective, it's not something special about how the neurons are entrained to the routed fast gamma, but that the higher firing rate neurons are better able to engage and entrain their local interneurons and, thus modulate local gamma. However, this interpretation challenges the discussion around the importance of fast gamma routed from the MEC.

      a. Please integrate the Grosmark & Buzsaki paper into the discussion.

      b. Also, please provide data that refutes or supports the alternative hypothesis in which the high firing rate cells are just more gamma modulated as they orchestrate local gamma activity through monosynaptic connections with local interneurons (e.g., Marshall et al., 2002, Hippocampal pyramidal cell-interneuron spike transmission is frequency dependent and responsible for place modulation of interneuron discharge). Otherwise, the attribution to an MEC-routed fast gamma seems tenuous.

      c. It is mentioned that fast-spiking interneurons were removed from the analysis. It would be worth including these cells, calculating the MVL in 1 Hz increments as well as the reciprocal of their ISIs (described above).

      (3) Methods - Spectral decomposition and Theta Harmonics.

      a. It is challenging to interpret the exact parameters that the authors used for their multi-taper analysis in the methods (lines 516-526). Tallon-Baudry et al., (1997; Oscillatory γ-Band (30-70 Hz) Activity Induced by a Visual Search Task in Humans) discuss a time-frequency trade-off where frequency resolution changes with different temporal windows of analysis. This trade-off between time and frequency resolution is well known as the uncertainty principle of signal analysis, transcending all decomposition methods. It is not only a function of wavelet or FFT, and multi-tapers do not directly address this. (The multitaper method, by using multiple specially designed tapers -like the Slepian sequences- smooths the spectrum. This smoothing doesn't eliminate leakage but distributes its impact across multiple estimates). Given the brevity of methods and the issues of theta harmonics as offered above, it is worth including some benchmark trace testing for the multi-taper as part of the supplemental figures.

      i. Please spectrally decompose an asymmetric 8 Hz sawtooth wave showing the trace and the related power spectral density using the multiple taper method discussed in the methods.

      ii. Please also do the same for an elliptical oscillation (perfectly symmetrical waves, but also capable of casting harmonics). Matlab code on how to generate this time series is provided below:

      A = 1; % Amplitude
      T = 1/8; % Period corresponding to 8 Hz frequency
      omega = 2*pi/T; % Angular frequency
      C = 1; % Wave speed
      m = 0.9; % Modulus for the elliptic function (0 < m < 1)
      x = linspace(0, 2*pi, 1000); % temporal domain
      t = 0; % Time instant

      % Calculate B based on frequency and speed
      B = sqrt(omega/C);

      % Cnoidal wave equation using the Jacobi elliptic function cn
      % (ellipj returns [sn, cn, dn]; the second output is cn)
      [~, cn] = ellipj(B.*(x - C*t), m);
      u = A .* cn.^2;

      % Plotting the cnoidal wave
      figure;
      plot(x./max(x), u);
      title('8 Hz Cnoidal Wave');
      xlabel('time (x)');
      ylabel('Wave amplitude (u)');
      grid on;

      The Symbolic Math Toolbox needs to be installed and accessible in your MATLAB environment to use ellipj. Otherwise, I trust that, rather than plotting a periodic orbit around a circle (sine wave), the authors can trace the movement around an ellipse with significant eccentricity (the distance between the two foci should be twice the distance between the co-vertices).
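
As a rough illustration of why such benchmark traces are informative, the sketch below builds an asymmetric 8 Hz sawtooth and compares its power at the second harmonic (16 Hz) with that of a pure 8 Hz sine. It uses a plain DFT evaluated at single bins, not the multi-taper method from the manuscript, and the sampling rate, the 25% rise fraction, and all function names are assumptions.

```python
import cmath
import math

def power_at(signal, fs, freq_hz):
    """Squared DFT magnitude at freq_hz (must fall on a DFT bin), over N^2."""
    n = len(signal)
    k = round(freq_hz * n / fs)
    coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                for i, v in enumerate(signal))
    return abs(coeff) ** 2 / n ** 2

def asymmetric_sawtooth(n_samples, fs, freq_hz=8.0, rise_fraction=0.25):
    """Zero-mean wave that rises for rise_fraction of each cycle, then falls."""
    wave = []
    for i in range(n_samples):
        cycle_pos = (freq_hz * i / fs) % 1.0
        if cycle_pos < rise_fraction:
            wave.append(cycle_pos / rise_fraction)
        else:
            wave.append((1.0 - cycle_pos) / (1.0 - rise_fraction))
    mean = sum(wave) / len(wave)
    return [v - mean for v in wave]

fs = 1000
n = 2 * fs  # 2-second trace, so 8 Hz and its harmonics fall on exact bins
saw = asymmetric_sawtooth(n, fs)
sine = [math.sin(2 * math.pi * 8.0 * i / fs) for i in range(n)]

saw_ratio = power_at(saw, fs, 16.0) / power_at(saw, fs, 8.0)
sine_ratio = power_at(sine, fs, 16.0) / power_at(sine, fs, 8.0)
print(f"sawtooth 16 Hz / 8 Hz power ratio: {saw_ratio:.4f}")
print(f"sine     16 Hz / 8 Hz power ratio: {sine_ratio:.2e}")
```

The sawtooth carries clear power at 16 Hz although no independent 16 Hz rhythm was added, which is the confound the benchmark is meant to expose; repeating the comparison with the authors' multi-taper settings would show how their estimator renders the same harmonics.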

      iii. Line 522: "The power spectra across running speeds and absolute power spectrum (both results were not shown)...". Given the potential complications of multi-taper discussed above, and as each convolution further removes one from the raw data, it would be most transparent, simple, and straightforward to provide power spectra using the simple fft.m code in Matlab. We imagine that the authors will agree that the results should be robust against different spectral decomposition methods. Otherwise, it is concerning that the results depend on the algorithm implemented, and this should be discussed. If gamma transience is a concern, the authors should trigger to 2-second epochs in which slow/fast gamma exceeds 3-7 std. dev. above the mean, comparing those resulting power spectra to 2-second epochs with ripples, which are also transient events. The time series should be at least 2 seconds in length (to avoid spectral leakage issues and the issues discussed in Tallon-Baudry et al., 1997 above).

      Please show the unmolested power spectra (Y-axis units in mV²/Hz, X-axis units in Hz) as a function of running speed (increments of 5 cm/s) for each animal. I imagine three of these PSDs for 3 of the animals will appear in supplemental methods while one will serve as a nice manuscript figure. With this plot, please highlight the regions that the authors are describing as theta, slow, and fast gamma. Also, any issues should be addressed should there be notable differences in power across animals or tetrodes (issues with locations along proximal-distal CA1 in terms of MEC/LEC input and using a local reference electrode are discussed below).

      iv. Schomberg and colleagues (2014) suggested that the modulation of neurons in the slow gamma range could be related to theta harmonics (see above). Harmonics can often extend nearly indefinitely as they regress into the 1/f background (contributing to power, but without a peak above the power spectral density slope), making arbitrary frequency limits inappropriate. Therefore, in order to support the analyses and assertions regarding slow gamma, it seems necessary to calculate a "theta harmonic/slow gamma ratio". Aru et al. (2015; Untangling cross-frequency coupling in neuroscience) offer that: "The presence of harmonics in the signal should be tested by a bicoherence analysis and its contribution to CFC should be discussed." Please test both the synthetic signals above and the raw LFP, using temporal windows of greater than 4 seconds (again, the large window optimizes for frequency resolution in the time-frequency trade-off) to calculate the bicoherence. As harmonics are integer multiples of theta coupled to itself and slow gamma is also coupled to theta, a nice illustration and contribution to the field would be a method that uses the bispectrum to isolate and create a "slow gamma/harmonic" ratio.
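
A minimal sketch of the bicoherence test on synthetic data, to make the logic concrete. It contrasts a 16 Hz component that is a true harmonic of 8 Hz (its phase is twice the fundamental's phase) with a 16 Hz component of independent phase. The 1-second segments are an assumption chosen so that DFT bin index equals frequency in Hz; the review asks for windows longer than 4 seconds on real data. All names and amplitudes are illustrative.

```python
import cmath
import math
import random

def dft_bin(segment, k):
    """Complex DFT coefficient of `segment` at integer bin k."""
    n = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(segment))

def bicoherence(segments, k1, k2):
    """Squared bicoherence at bins (k1, k2), averaged across segments."""
    num, den_a, den_b = 0j, 0.0, 0.0
    for seg in segments:
        x1, x2, x3 = dft_bin(seg, k1), dft_bin(seg, k2), dft_bin(seg, k1 + k2)
        num += x1 * x2 * x3.conjugate()
        den_a += abs(x1 * x2) ** 2
        den_b += abs(x3) ** 2
    return abs(num) ** 2 / (den_a * den_b)

def make_segments(coupled, rng, n_seg=40, fs=128):
    """1-second segments holding 8 Hz and 16 Hz components."""
    segments = []
    for _ in range(n_seg):
        phi = rng.uniform(0, 2 * math.pi)
        # A harmonic inherits twice the fundamental's phase;
        # an independent rhythm gets its own random phase.
        psi = 2 * phi if coupled else rng.uniform(0, 2 * math.pi)
        segments.append([
            math.cos(2 * math.pi * 8 * i / fs + phi)
            + 0.5 * math.cos(2 * math.pi * 16 * i / fs + psi)
            for i in range(fs)
        ])
    return segments

rng = random.Random(1)
harmonic = bicoherence(make_segments(True, rng), 8, 8)
independent = bicoherence(make_segments(False, rng), 8, 8)
print(f"harmonic 16 Hz:    bicoherence(8, 8) = {harmonic:.3f}")
print(f"independent 16 Hz: bicoherence(8, 8) = {independent:.3f}")
```

Both signals have identical power spectra, so only the phase-sensitive bicoherence separates them. The same quantity evaluated at (theta, theta), (theta, 2×theta), and so on could feed the harmonic ratio proposed above.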

      (4) I appreciate the inclusion of the histology for the 4 animals. Knierim and colleagues describe a difference in MEC projection along the proximal-distal axis of the CA1 region (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866456/): "There are also differences in their direct projections along the transverse axis of CA1, as the LEC innervates the region of CA1 closer to the subiculum (distal CA1), whereas the MEC innervates the region of CA1 closer to CA2 and CA3 (proximal CA1)." From the histology, it looks like some of the electrodes are in the part of CA1 that would be dominated by LEC input while a few are closer to where the MEC would project.

      a. How do the authors control for these differences in projections? Wouldn't this change whether or not fast gamma is observed in CA1?

      b. I am only aware of one manuscript that describes slow gamma in the LEC which appeared in contrast to fast gamma from the MEC (https://www.science.org/doi/10.1126/science.abf3119). One would surmise that the authors in the present manuscript would have varying levels of fast gamma in their CA1 recordings depending on the location of the electrodes in the Proximal-distal axis, to the extent that some of the more medial tetrodes may need to be excluded (as they should not have fast gamma, rather they should be exclusively dominated by slow gamma). Alternatively, the authors may find that there is equal fast gamma power across the entire proximal-distal axis. However, this would pose a significant challenge to the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz et al. and require reconciliation/discussion.

      c. Is there a difference in neuron modulation to these frequencies based on electrode location in CA1?

      (5) Given a comment in the discussion (see below), it will be worth exploring changes in theta, theta harmonic, slow gamma, and fast gamma power with running speed as no changes were observed with theta sequences or lap number versus. Notably, Czurko et al. (1999) report an increase in theta and harmonic power with running speed, while Ahmed and Mehta (2012) report a similar effect for gamma.

      a. Please determine whether the power and frequency of the rhythms discussed above change with running speed, using the same parameters applied in the present manuscript. The specific concern is that the authors' method of calculating running speed may not be sensitive enough to evaluate changes.

      b. It is astounding that animals ran as fast as they did in what appears to be the first lap (Figure 3F), especially as rats' natural proclivity is thigmotaxis and inquisitive exploration in novel environments. Can the authors expand on why they believe their rats ran so quickly on the first lap in a novel environment and how to replicate this? Also, please include the individual values for each animal on the same plot.

      c. Can the authors explain how the statistics on line 169 (F(4,44)) work? Specifically, it is challenging to determine how the degrees of freedom were calculated in this case and throughout if there were only 4 animals (reported in methods) over 5 laps (depicted in Figure 3F. Given line 439, it looks like trials and laps are used synonymously). Four animals over 5 laps should have a DOF of 16.

      (6) Throughout the manuscript, I am concerned about an inflation of statistical power. For example, on line 162, F(2,4844). The large degrees of freedom indicate that the unit of analysis was individual theta sequences or cells rather than animals. Since multiple observations were obtained from the same animal, the statistical assumption of independence is violated. Therefore, the stats need to be conducted using a nested model as described in Aarts et al. (2014; https://pubmed.ncbi.nlm.nih.gov/24671065/). A statistical consult may be warranted.
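
To give a sense of scale for this concern, the sketch below applies the standard design-effect approximation for clustered data, 1 + (m - 1) × ICC, where m is the number of observations per cluster. It is not a substitute for the nested model requested above. The total of 4847 observations is an assumption (it would be consistent with F(2,4844) in a three-group one-way design), and the intraclass correlation (ICC) values are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_observations, n_clusters, icc):
    """Approximate number of independent observations under clustering."""
    cluster_size = n_observations / n_clusters
    return n_observations / design_effect(cluster_size, icc)

# Hypothetical scenario: 4847 observations nested within 4 animals.
for icc in (0.0, 0.05, 0.2, 1.0):
    n_eff = effective_sample_size(4847, 4, icc)
    print(f"ICC = {icc:.2f} -> effective n = {n_eff:.1f}")
```

With ICC = 0 the observations are independent and the effective n equals the raw count; with ICC = 1 every observation within an animal is redundant and the effective n collapses to the number of animals.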

      (7) It is stated that one tetrode served as a quiet recording reference. The "quiet" part is an assumption when often, theta and gamma can be volume conducted to the cortex (e.g., Sirota et al., 2008; This is often why laboratories that study hippocampal rhythms use the cerebellum for the differential recording electrode and not an electrode in the corpus callosum). Generally, high frequencies propagate as well as low frequencies in the extracellular milieu (https://www.eneuro.org/content/4/1/ENEURO.0291-16.2016). For transparency, the authors should include a limitation paragraph in their discussion that describes how their local tetrode reference may be inadvertently diminishing and/or distorting the signal that they are trying to isolate. Otherwise, it would be worth hearing an explanation as to how the authors' approach avoids this issue.

      This review is already getting long, for which I apologize. Moreover, I have substantial concerns that should be resolved prior to delving into the remainder of the analyses. For example, the analyses related to Figures 3-5 assert that FG cells are important for sequences. However, the relationship to gamma may be secondary to either their relationship to theta or, based on the Grosmark and Buzsaki paper, it may just be a phenomenon coupled to the fast-firing cells (fast-firing cells showing higher gamma modulation due to a local PING dynamic). Moreover, the observation of slow gamma is being challenged as theta harmonics, even by the major proponents of the slow/fast gamma theory. Therefore, the report of slow gamma precession would come as an unsurprising extension should they be revealed to be theta harmonics (however, no control for harmonics was implemented; suggestions were made above). Following these amendments, I would be grateful for the opportunity to provide further feedback.

      III. Discussion.

      a. Line 330- it was offered that fast gamma encodes information while slow gamma integrates in the introduction. However, in a task such as circular track running (from the methods, it appears that there is no new information to be acquired within a trial), one would guess that after the first few laps, slow gamma would be the dominant rhythm. Therefore, one must wonder why there are so few neurons modulated by slow gamma (~3.7%).

      b. Line 375: The authors contend that: "...slow gamma, related to information compression, was also required to modulate fast gamma phase-locked cells during sequence development. We replicated the results of slow gamma phase precession at the ensemble level (Zheng et al., 2016), and furthermore observed it at late development, but not early development, of theta sequences." In relation to the idea that slow gamma may be coupled to - if not a distorted representation of - theta harmonics, it has been observed that there are changes in theta relative to novelty.

      i. A. Jeewajee, C. Lever, S. Burton, J. O'Keefe, and N. Burgess (2008) report a decrease in theta frequency in novel circumstances that disappears with increasing familiarity.

      ii. One could surmise that this change in frequency is associated with alterations in theta harmonics (observed here as slow gamma), challenging the authors' interpretation.

      iii. Therefore, the authors have a compelling opportunity to replicate the results of Jeewajee et al., characterizing changes of theta along with the development of slow gamma precession, as the environment becomes familiar. It will become important to demonstrate, using bicoherence as offered by Aru et al., how slow gamma can be disambiguated from theta harmonics. Specifically, we anticipate that the authors will be able to quantify A) theta harmonics (the number, and their respective frequencies and amplitudes), B) the frequency and amplitude of slow gamma, and C) how they can be quantitatively decoupled. Through this, their discussion of oscillatory changes with novelty-familiarity will garner a significant impact.

      c. Broadly, it is interesting that the authors emphasize the gamma frequency throughout the discussion. Given that the power spectral density of the Local Field Potential (LFP) exhibits a log-log relationship between amplitude and frequency, as described by Buzsáki (2005) in "Rhythms of the Brain," and considering that the LFP is primarily generated through synaptic transmembrane currents (Buzsáki et al., 2012), it seems parsimonious to consider that the bulk of synaptic activity occurs at lower frequencies (e.g., theta). Since synaptic transmission represents the most direct form of inter-regional communication, one might wonder why gamma (characterized by lower amplitude rhythms) is esteemed so highly compared to the higher amplitude theta rhythm. Why isn't the theta rhythm, instead, regarded as the primary mode of communication across brain regions? A discussion exploring this question would be beneficial.

    1. Reviewer #2 (Public Review):

      In their paper entitled "Molecular, Cellular, and Developmental Organization of the Mouse Vomeronasal Organ at Single Cell Resolution" Hills Jr. et al. perform single-cell transcriptomic profiling and analyze tissue distribution of a large number of transcripts in the mouse vomeronasal organ (VNO). The use of these complementary tools provides a robust approach to investigating many aspects of vomeronasal sensory neuron (VSN) biology based on transcriptomics. Harnessing the power of these techniques, the authors present the discovery of previously unidentified sensory neuron types in the mouse VNO. Furthermore, they report co-expression of chemosensory receptors from different clades on individual neurons, including the co-expression of VR and OR. Finally, they evaluated the correlation between transcription factor expression and putative surface axon guidance molecules during the development of different neuronal lineages. Based on such correlation analysis, authors further propose a putative cascade of events that could give rise to different neuronal lineages and morphological organization.

      Taken together, Hills Jr. et al. present findings on (a) cell types in the VNO, (b) novel classes of sensory neurons, (c) developmental trajectories of the neuronal lineage, (d) receptor expression in VSNs, (e) co-expression of chemosensory receptors, (f) a surface molecule code for individual receptor types, and (g) transcriptional regulation of receptor and axon guidance cues. Before outlining the major strengths and weaknesses of the manuscript, we need to disclose that, while we are comfortable reviewing aspects (a) to (e) of their work, we lack the expertise to provide constructive criticism on the two last points (f) and (g). Thus, we will not comment on these.

      In general, interpretations/claims put forward by Hills Jr. et al. appear striking at first glance. Upon careful review of the manuscript, however, it becomes apparent that many of the groundbreaking discoveries lack compelling support. Several (not all) of the results presented in this work lack novelty, accurate interpretability, and corroboration. A recurrent theme throughout the manuscript is an incomplete, and somewhat misleading account of the current knowledge in the field. This is perhaps most apparent in the introductory paragraphs, where the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here. For example: "...In contrast, transcriptomic information of the VNO is rather limited (Ref 24,25)...". Indeed, transcriptomic information of the mouse VNO is limited. Here, however, the authors ignore recent reports of robust single-cell transcriptomic analysis from adult and juvenile mice. These papers are, in part, cited later in this manuscript (ref 88, 89, 90, 91), or are completely missing (doi.org/10.7554/eLife.77259). Regardless, previously published results on the same topics have to be included in the Introduction to put the background and novelty of the findings into perspective.

      General comments on (a) cell types in the VNO

      The authors performed single-cell transcriptomic analysis of a large number of cells from both adult and juvenile VNO, creating the largest dataset of its kind to date. This dataset contains a wealth of information and, once made public, could be a valuable resource to the community. However, the analysis implemented in this paper raises several questions:

      Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?

      The authors should report the marker genes used for cell annotation. This is important for data validation, comparison with other publicly available datasets, as well as future use of this dataset.

      The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.

      "Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone." This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.

      General comments on (b) novel classes of sensory neurons

      The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.

      The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?

      Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).

      Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?

      When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?

      The low total genes count and low total reads count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything V1R, V2R, OR, secretory proteins...

      General comments on (c) developmental trajectories of the neuronal lineage

      The authors evaluated a possible cascade of events leading to the development of different lineages of mature sensory neurons using GBCs as a starting point. They found the differential expression of several transcription factors at different stages of development. This analysis was performed correctly, and its interpretation is coherent. However, it is mysterious why the authors included only classical V1R and V2R-expressing neurons, while the novel sensory neurons, cOSN and sVSN, were not included. Furthermore, it is important to notice again the misreport of previously published works.

      The authors wrote "...the transcriptomic landscape that specifies the lineages is not known...". This statement is not completely true, or at least misleading. There are still many undiscovered aspects of the transcriptomics landscape and lineage determination in VSNs. However, authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) was already reported, and for some of them, their role was investigated, during early developmental stages of VSNs (Ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), clearly state what has been already reported, what is contradictory and what is new when compared with the results from this work.

      General comments on (d) receptor expression in VSNs

      The authors evaluated the expression of chemosensory receptors in the VNO and correlated receptor expression with the expression of transcription factors. The analysis of such correlation showed that, while the expression of V1Rs is mainly correlated with the expression of the transcription factor Meis2, the expression of V2Rs is correlated with the combination of many transcription factors. These results are interesting; however, the co-expression of specific V2Rs with specific transcription factors does not imply a direct role in receptor selection. Directed experiments to evaluate whether VR expression depends on a specific transcription factor must be performed.

      This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.

      General comments on (e) co-expression of chemosensory receptors

      The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. Molecular Cartography is a powerful tool and the reported data in this work is truly interesting. The authors show some clear confirmation of previously reported V2R co-expression (Figure 5H), and new co-expression of chemosensory receptors including V1R, V2R, and Fpr (Figure 5G-K).

      However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.

      It is surprising that the authors reported a new cell type expressing ORs but did not report the expression of ORs with the Molecular Cartography technique. Did the authors evaluate OR expression using this technique?

    1. "The head of the school is required to forward to the appeals commission the reasoned decisions together with any information likely to inform it (Article D.331-35 - Code de l'éducation)." This means that a decision by the head of the school that is insufficiently reasoned can be annulled.
    2. Any post-3e orientation decision that does not match the family's request must be accompanied by reasons. The head of the school is required to put forward the objective elements on which the decision rests. "The reasons include the objective elements on which the decisions were based, in terms of knowledge, abilities, and interests (Article D331-34 - Code de l'éducation)."
    1. Pursuant to Article D. 511-31 of the Code de l'éducation, the head of the school summons, in the same manner: the student and, if the student is a minor, their legal representative; the person, if any, responsible for assisting the student in presenting their defense; the person who asked the head of the school to have the student appear; and, finally, the witnesses or persons likely to inform the council about the facts motivating the student's appearance.
    1. Video summary [00:00:02] - [00:48:41]:

      This video presents the Observable framework for building dashboards, reports, and web applications efficiently and for free. It explains how to use Observable to document features, introduces the concept of the data loader for refreshing data, and shows how to embed Observable work in a static website.

      Highlights:

      + [00:00:08] Introduction to Observable
        * Presentation of the Observable framework as a free, open-source static site generator
        * Use of Markdown and JavaScript for documentation
        * Free hosting on platforms such as GitHub Pages
      + [00:01:36] Specialization for dashboards
        * Observable is specialized for applications that need their data refreshed regularly
        * Introduction of the data loader concept for periodic data updates
        * Creation of static websites that can refresh their data efficiently
      + [00:03:00] JavaScript development with Observable
        * Observable as a distinctive JavaScript development environment with reactivity between declarations
        * Explanation of reactivity and variable dependencies in Observable
        * Use of Markdown, LaTeX, and JavaScript to create interactive content
      + [00:10:13] Using libraries and version management
        * Observable can call external libraries and includes a simplified version manager
        * Sharing and publishing notebooks for collaboration and reuse
        * Examples of tutorials and courses available on Observable
      + [00:24:26] Getting started with the framework
        * Process for creating, editing, and previewing a site with Observable
        * Use of GitHub Actions to refresh data automatically
        * Integration of animations and visualizations in a static website
      + [00:40:15] Examples of applications built with Observable
        * Presentation of varied applications, such as the evolution of chess players and a hotel dashboard
        * Conversion of an existing JavaScript application into an improved version with Observable

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their careful evaluation and constructive criticism of our manuscript. We also appreciate the positive assessments from all three reviewers. The reviewers noted:

      • "The computational model in this manuscript can be a tool to discover unknown molecular pathways interactions in cardiomyocyte proliferation."
      • "This is an interesting study reporting the generation of a computational model of cardiomyocyte proliferation, which predicts molecular drivers of cell cycle progression."
      • "The model provides a convenient systems framework to prioritize potential signaling drivers of therapeutic modulators of cardiomyocyte proliferation."

      We have responded to all reviewer comments and have outlined the corresponding additions and changes to the manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In the manuscript by Harris et al. titled "Dynamic map illuminates Hippo to cMyc module crosstalk driving cardiomyocyte proliferation," the authors developed a computational model of cardiac proliferation signaling that incorporates various regulatory networks (cytokinesis, mitosis, DNA replication, etc.) to predict molecular drivers (genes) that support cardiomyocyte proliferation. Published research articles on cardiomyocyte proliferation in multiple contexts (different species, ages, in vitro and in vivo, etc) were used to build and validate the computational model. The authors found using their model that different processes during cardiomyocyte proliferation may or may not be context-dependent. For example, DNA replication is regulated differently in conditions with high Neuregulin compared to high YAP, whereas mitosis and cytokinesis regulation is similar in these conditions. To experimentally validate their model, the authors used an in vitro system to test the effects of YAP on 3 connected pathways; in the context of YAP activation, inhibition of PI3K, cMyc, or FoxM1 was combined to assay cell-cycle markers in cultured neonatal rat ventricular cardiomyocytes. Cell-cycle marker expression in cardiomyocytes was attenuated by inhibition of cMyc or PI3K, suggesting that these pathways are involved in YAP-mediated cardiomyocyte proliferation. While this model can be a good tool to gain new insights on interactions between molecular pathways, there are a few questions to be addressed prior to publication.

      We appreciate the Reviewer's positive remarks about important findings in our manuscript and the ability of our model to be a tool to gain insights on interactions between molecular pathways to regulate cardiomyocyte proliferation. We have strived to address their points, as shown below.

      Major Comments:

      1. One of the potential uses for this computational model is to discover new interactions between known pathways that are involved in cardiomyocyte proliferation. However, this would be more powerful if factors such as species, age (neonate vs. adult), and experimental design (in vivo vs. in vitro) were accounted for, as new node inputs or a combination of existing node input activity values. This is very important because cardiomyocyte proliferation can drastically vary depending on these experimental factors.

      We agree that future extensions of this model accounting for species, age, and experimental design may enable an understanding of how these factors regulate proliferation. While this model's predictions are most relevant to immature cardiomyocytes, we note that it is the first systems model of the molecular network regulating cardiomyocyte proliferation. We extensively validated it against neonatal cardiomyocyte literature and then made new predictions regarding Hippo-cMyc pathways, which we validated in new cardiomyocyte experiments and against data in adult mice. This provides a strong foundation for future extensions. We now address this potential in the Discussion:

      "While our model's predictions are most relevant to immature cardiomyocytes, it is the first systems model of the molecular network regulating cardiomyocytes. In the future, we hope that we and others may extend this model to identify how factors like species, age, and experimental design regulate proliferation. However, these endeavors would span multiple manuscripts, and the field currently lacks sufficient stage-specific data. For example, a previous foundational computational model of cardiomyocyte electrophysiology (Luo and Rudy, Circ Res 1994) focused on adult guinea pigs. This model became the foundation for a range of developmental and species-specific models in electrophysiology (Tusscher et al, AJP 2004; Courtemanche et al, AJP 1998; Paci et al, ABME 2013). We believe the open availability of our code will enable similar dissemination and extension for additional factors." Line 651-661

      For reference:

      Luo and Rudy, Circ Res 1994, >2.1k citations; Tusscher et al, AJP 2004, >1.7k citations; Courtemanche et al, AJP 1998, 1.5K citations; Paci et al, ABME 2013, 147 citations

      The finding that cardiomyocyte proliferation is context-dependent is very exciting and warrants further investigation/validation. The authors state that different sets of nodes/modules are affected by neuregulin activation compared to YAP activation. This should be experimentally validated - qPCR/Western blots on sets of genes that are predicted to be differentially regulated in the high neuregulin context vs the high YAP context.

      We agree that the model's prediction of context-dependent cardiomyocyte proliferation is very exciting. To further validate these predictions, we have performed additional experiments measuring context-dependent changes in phospho-ERK in cells treated with Nrg or TT10. Using a high-throughput capillary electrophoresis western blot system, we observed that with a short treatment of 30 min, Nrg induces greater phospho-ERK than TT10, which validates our model predictions at short time intervals. Additionally, the model predicted greater p-AKT with 30 min of Nrg treatment than with TT10. To validate this prediction, we now compare to Western blots from Hara et al. examining p-AKT in Nrg- and TT10-treated cells. Consistent with our model predictions, their data show that Nrg induces greater p-AKT than TT10. We have added new panels C, D, and E to Figure 4.

      Figure 4: Influence of node knockdowns shifts with context, revealing crosstalk from Hippo to Growth Factor modules.

      (A) Total influence of node knockdowns on the DNA replication, mitosis, and cytokinesis modules, compared across multiple signaling contexts: baseline, high Nrg, and high YAP. Total influence sums the overall effect of a node knockdown on a network module. (B) The total influence of each network module varies depending on whether a basal state, high Nrg, or high YAP signaling context is applied. (C) Capillary electrophoresis western blot for phosphorylated ERK, beta-actin, and GAPDH from neonatal cardiomyocytes treated with Nrg or TT10 for 30 min. (D) Model predictions of AKT and ERK activity of acute response to Nrg or TT10 (time constants for gene expression set to 100). (E) Quantification of effects of Nrg or TT10 treatment on p-ERK (from Western blot in panel C, n = 3) or p-AKT (from Western blot from (Hara et al., 2018), n = 1).
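
      The legend describes total influence as a sum of a knockdown's effects across a module. A minimal sketch of one way to compute such a quantity is given below. It assumes the "effect" on each node is the absolute change in steady-state activity between control and knockdown; the source does not state whether changes are signed or absolute. The function name, node names, and activity values are hypothetical and are not taken from the manuscript.

```python
def total_influence(control, knockdown, module_nodes):
    """Sum of absolute activity changes over the nodes of one module.

    Assumption: influence is measured as |activity_after_knockdown - activity_control|.
    """
    return sum(abs(knockdown[node] - control[node]) for node in module_nodes)


# Hypothetical steady-state activities on a 0..1 scale (illustration only).
control = {"nodeA": 0.60, "nodeB": 0.40, "nodeC": 0.20}
after_knockdown = {"nodeA": 0.10, "nodeB": 0.35, "nodeC": 0.20}

# Hypothetical module made of two of the three nodes.
module = ["nodeA", "nodeB"]
influence = total_influence(control, after_knockdown, module)
```

      Repeating the calculation for each knocked-down node and each signaling context would give one value per bar in a plot like panel A.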

      The overall description of the model can be improved. For example, how are the input and parameters set to validate or predict different experimental observations? What is the steady state activity of each of the nodes and does this make sense biologically? Including a few more sentences to explain the model would help with overall understanding for an uninformed reader.

      We have addressed the following questions provided by the reviewer in the methods and results section of the manuscript:

      How are the input and parameters set to validate or predict different experimental observations?

      "At baseline, input reaction weight parameters (w) were set based on information from the literature describing the baseline state of these inputs in the heart (each input reaction weight can be found in Supplemental File 1). To simulate experiments with biochemical stimuli, input reaction weights were increased to 0.8 or 1. To simulate experiments with inhibition or knockdown, the corresponding maximum species value (ymax) was set to 0.1 or 0. Complete annotations for all validation simulations are provided in Supplemental File 2." Line 154-160
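
      The quoted passage can be illustrated with a toy network. The sketch below assumes a normalized-Hill, logic-based differential equation of the general form used in earlier network models of this kind (default n = 1.4 and EC50 = 0.5 are assumed here, not taken from the manuscript). The two-node chain, the baseline input weight of 0.1, the forward-Euler integrator, and the stopping tolerance are all hypothetical choices for illustration.

```python
def hill_activation(x, w=1.0, n=1.4, ec50=0.5):
    """Normalized Hill activation scaled by reaction weight w.

    Constrained so that f(0) = 0, f(ec50) = 0.5 * w and f(1) = w.
    """
    beta = (ec50**n - 1.0) / (2.0 * ec50**n - 1.0)
    k_to_n = beta - 1.0  # K**n in the usual Hill notation
    x = min(max(x, 0.0), 1.0)  # keep activity within [0, 1]
    return w * beta * x**n / (k_to_n + x**n)


def simulate_chain(w_input, ymax_a=1.0, ymax_b=1.0, tau=1.0,
                   dt=0.01, tol=1e-9, max_steps=1_000_000):
    """Forward-Euler integration of a toy chain: input => A => B.

    Stops when both rates of change fall below tol (an assumed criterion).
    """
    a = b = 0.0
    for _ in range(max_steps):
        da = (w_input * ymax_a - a) / tau
        db = (hill_activation(a) * ymax_b - b) / tau
        a += dt * da
        b += dt * db
        if max(abs(da), abs(db)) < tol:
            break
    return a, b


baseline = simulate_chain(w_input=0.1)                 # hypothetical low baseline input
stimulated = simulate_chain(w_input=1.0)               # stimulus: input weight raised to 1
knockdown = simulate_chain(w_input=1.0, ymax_a=0.0)    # node A knocked down: ymax = 0
```

      In this toy chain, raising the input weight increases the activity of the downstream node B, while setting ymax of A to 0 silences B even under full stimulation, which is the qualitative behavior the quoted text relies on.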

      What is the steady-state activity of each of the nodes and does this make sense biologically?

      "Steady-state activity of model nodes was obtained by running the model until there was a

      Minor Comments:

      Line 124 - The use of "species" and "reactions" is confusing to uninformed readers. Do you mean nodes and interactions/bridges?

      We now further clarify these terms in the manuscript:

      "As in past network models (Zeigler et al., PMID 27017945; Tan et al., PMID 29131824; Kraeutler et al PMID 21087478), species (or nodes) refer to a small molecule, gene, protein, or process. Reactions (or edges) are activating or inhibiting relationships between network species." Line 143-146

      Line 130 - I could not find Supplementary File 2, which includes the references

      We apologize for the error. Supplementary File 2 references articles and resources used to build the model. These files are now attached.

      Line 257 - What is the meaning of the directional arrows in Fig 1A?

      We clarified the Fig 1A legend:

      "Arrows between modules represent one or more reactions that link species from one module to species in another module." Line 594-595

      Line 301 - Unclear what default values mean here. Please elaborate and provide an example of how this is reasonable.

      We have added to the manuscript a further description of the default parameter values.

      "A previous study identified default values of the parameters (ymax, EC50, W, etc.) that most accurately predict the results of knockdown screens compared to a model where all biochemical parameters were measured experimentally (Kraeutler et al 2010). Subsequent studies started from these default values and further demonstrated that model accuracy was robust to random variation in the parameters (Tan et al 2017, Zeigler et al 2017). Consistent with these prior models, we performed robustness analysis that demonstrates that the CM proliferation model accuracy (compared against 78 experiments) is maintained at >80% with up to 35% variation in ymax, 30% variation in w, and a variation of >50% in EC50 (Figure S4)." Line 305-312

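
      The robustness analysis described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: parameters are perturbed by a uniform random factor within a maximum fraction, a prediction counts as correct when its direction matches the reported observation, and `toy_predict` is a hypothetical stand-in for a full model simulation. The 0.05 change threshold, parameter values, and the two example experiments are also invented for illustration.

```python
import random


def perturb(params, max_fraction, rng):
    """Scale each parameter by a random factor in [1 - max_fraction, 1 + max_fraction]."""
    return {name: value * (1.0 + rng.uniform(-max_fraction, max_fraction))
            for name, value in params.items()}


def toy_predict(params, perturbation, threshold=0.05):
    """Hypothetical stand-in for a model run; returns a qualitative direction."""
    def response(x, ymax):
        return min(ymax * params["w"] * x, 1.0)

    base = response(0.2, params["ymax"])
    if perturbation == "stimulate":
        changed = response(0.9, params["ymax"])
    else:  # "knockdown": maximum activity forced to zero
        changed = response(0.2, 0.0)
    delta = changed - base
    if delta > threshold:
        return "increase"
    if delta < -threshold:
        return "decrease"
    return "no change"


def accuracy(predict, params, experiments):
    """Fraction of experiments whose observed direction the model reproduces."""
    hits = sum(predict(params, e["perturbation"]) == e["observed"] for e in experiments)
    return hits / len(experiments)


def mean_accuracy_under_variation(predict, params, experiments,
                                  max_fraction, n_trials=100, seed=0):
    """Average accuracy over n_trials random parameter sets."""
    rng = random.Random(seed)
    scores = [accuracy(predict, perturb(params, max_fraction, rng), experiments)
              for _ in range(n_trials)]
    return sum(scores) / n_trials


default_params = {"w": 1.0, "ymax": 1.0}
experiments = [
    {"perturbation": "stimulate", "observed": "increase"},
    {"perturbation": "knockdown", "observed": "decrease"},
]
robust_35 = mean_accuracy_under_variation(toy_predict, default_params, experiments, 0.35)
```

      Sweeping `max_fraction` and plotting the mean accuracy against it would give a curve of the kind summarized for Figure S4, with the full model in place of `toy_predict`.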

      Supplemental FigS2 - Why would knockdown of PKA, Lats1 or SMAD3 have the exact same effects on node activation? This is seen with multiple other genes as well (IGF and FGF, for example).

      PKA, Lats1, and SMAD3 all inhibit cell cycle progression in part through cMyc. Therefore, their knockdowns have similar effects on downstream signaling and proliferation. Similarly, IGF and FGF both stimulate Ras and PI3K via similar mechanisms, which is consistent with experimental studies of IGF- and FGF-dependent proliferation.


      Reviewer #1 (Significance (Required)):


      The computational model in this manuscript can be a tool to discover unknown molecular pathway interactions in cardiomyocyte proliferation. The novelty lies in the ability to adjust any parameter or the entire setting/context. While this sounds very exciting, improvement of the model to account for age, experimental conditions (in vivo vs in vitro), and species (human, pig, mouse) could lead to increased prediction accuracy. Additionally, more robust validation of context-dependent interactions between signaling pathways would also increase overall enthusiasm for the manuscript. Readers interested in a systems biology approach to cardiomyocyte proliferation, or researchers probing molecular interactions during cardiomyocyte proliferation, would be interested in using such a model to discover novel contexts/combinations in which cardiomyocyte proliferation is more likely.


      The reviewer comes from a varied training background and is qualified to evaluate this manuscript in full - BS in biomedical engineering and mathematics. PhD in biomedical engineering (molecular biology, cardiac electrophysiology). Postdoctoral training in cardiac regeneration and immunity.


      We appreciate the positive comments about our model of the cardiomyocyte proliferation network. As described above, we believe that we have addressed the concerns with additional experimental validation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):


      The manuscript submitted by Harris and colleagues collates a molecular map of cardiomyocyte cell cycle activation through mathematical modeling of previously published experimental results. They attempt to validate the constructed model in several ways: 1) through testing results compiled from additional literature, 2) through in vitro analysis, and 3) through in vivo supporting data. When validated against additional literature, the model proves quite reliable, particularly for predicting effects on synthesis, mitosis, and cytokinetic entry, but was less reliable (or insufficiently tested) at predicting completion of these stages as determined by polyploidization and multinucleation. A potentially novel observation that arose from the model - that the Hippo module connects to the growth factor module through PI3K, Myc, and FoxM1 - was partially confirmed with in vitro experiments, though a few more experiments are warranted.

      We appreciate the reviewer's recognition of the important contributions of this model of the cardiomyocyte proliferation network. We have addressed the concerns below.

      Major comments:

      • The model is admittedly weakest in its handling of completion of cytokinesis resulting in new daughter cells (i.e. proliferation) versus failure to complete either M phase or cytokinesis resulting in the much more common cellular phenotypes - polyploidy and multinucleation. Notably, very few molecules were "tested" for this output (figure 2) and this proved the least reliable aspect of the model/map. I wonder if the authors consulted the literature on somatic polyploidization at all when building the model (files not provided as indicated, see minor comment 1 below)? And if not, would doing so help strengthen this arm of their map? There are some great reviews on the topic (see PMIDs 25921783, 23849927, 30021843) - while admittedly much of the work is done on other cell types (i.e. trophoblast giant cells and hepatocytes) maybe understanding the molecular intricacies in these cells could be incorporated to strengthen the predictive model in cardiomyocytes. Notably, PMID 23849927 even provides a table of citations about key nodes in the model influencing polyploidy.

      To validate this model, we used entirely cardiomyocyte-specific studies. We appreciate the reviewer's reference to PMID 23849927, which enabled us to add two additional experiments to the validation table in Figure 2. That paper found that overexpression of either cMyc or cyclin D increases polyploidy, both of which matched our new simulations in the updated Figure 2.

      Motivated by the reviewer's citation of PMID 23849927, we further validated the model against polyploidization data from multiple cell types, finding an 85.7% accuracy (6 of 7 experiments) as now shown in Supplementary Figure S7.

      We included an additional discussion of polyploidization in the manuscript.

      "Our model validation is notably weakest in predicting experiments on polyploidization, indicating a need to better characterize polyploidy and cytokinesis pathways. Because such data are limited in cardiomyocytes, we performed an additional validation against polyploidization experiments from other cell types as summarized in Pandit et al. Our CM proliferation model correctly predicted 85% (6 of 7) of these experiments. Future experiments are needed to identify conserved or differential mechanisms of polyploidization and cytokinesis in cardiomyocytes." Line 587-594

      • Paragraph on the cytokinesis module (lines 364-377) is confusing - not sure what the takeaway message is. Also, while progression through G1/S and G2/M are "required" for cytokinesis they on their own are not sufficient (lines 366-368), this perhaps goes back to major comment 1.

      We agree this sentence was confusing; it was meant to be introductory rather than to state a particular result. We removed that sentence and further revised our description of the output module to clarify the model structure:

      "The output module interlinks the phenotypic outputs of the other modules, representing how experimentally measured aspects of cell cycle activity (DNA replication by EdU or Ki67, mitosis by phospho-Histone 3 (pHH3), abscission by cytokinetic midbody) converge on polyploidy, binucleation, or cytokinesis (e.g. completed proliferation) (Figure 1G)." Line 283-286

      Minor comments:

      • Use of the word "Proliferation" should be reserved for situations where the authors can clearly say a new daughter cell was born. In many instances, "cell cycle activation" or "cell cycle progression" might be better terms.

      As suggested by the reviewer, we now use "cell cycle progression" in 7 instances, reserving "proliferation" for cell cycle progression through cytokinesis. In the remaining 90 instances, we refer to proliferation based on the model's predictions of completed cell division based on the combined DNA replication, mitosis, and cytokinesis pathways in the "output module". We retain "proliferation" in the title because the model encompasses the entire proliferation process from cell cycle entry through cytokinesis.

      • Supplementary Files 1 & 2 or Supplementary Document 2 were not provided or not found during review, thus we were unable to confirm which literature were used to build and validate the model.

      Thank you. We have included Supplementary Files 1 and 2 along with Supplementary Document 2 in the submission.

      • Figures are too small, particularly Figure 1.

      We have enlarged Figure 1.

      • "E2F" should be specified as E2F1-3 yield quite distinct results from E2F7/8.

      We have changed "E2F" to "E2F123".

      • Text corresponding to Figure 5 does not reference most of the panels in the Figure. i.e. figures are not "cited" in the text.

      We have made sure that each panel in Figure 5 is referenced in the text addressing the figure. We have also bolded all references to Figure 5.


      • Figure 5C - why are there no bars for PI3K? Text claims it was predicted by the model, but the data are missing?

      We apologize for the confusion regarding Figure 5C, in which the bar for PI3K was near zero. We now clarify this in the legend.

      "Predicted DNA replication and mitosis activity is close to zero when PI3K is inhibited alone and when PI3K is inhibited in combination with TT10 treatment."

      • Data provided in figure 5D & E are insufficient on their own to claim "proliferation". Perhaps adding total cardiomyocyte numbers, where one would expect expansion compared to control.

      We agree that Ki67 and pH3 are not sufficient to claim "proliferation", so we modified the Figure 5 legend to:

      "Prediction and experimental validation of cardiomyocyte cell cycle progression mediated by the Hippo pathway via PI3K, cMyc, and FoxM1."

      We previously found that cardiomyocyte numbers without live tracking are not sufficient to robustly measure proliferation (Woo et al, J Mol Cardiol, 2019).


      • Consider adding details about the p-values to the figure legend in figure 5.

      Thank you for this suggestion. P-value information has been added to the legend of Figure 5. We use ***

      Our literature-based validation in Figure 2 focused on 78 experiments that examined well-established and corroborated aspects of cardiomyocyte proliferation. Later in the paper, we focused on a newly predicted mechanism of cardiomyocyte proliferation involving a small number of comparisons that would naturally have a lower a priori probability of validation in in vitro neonatal experiments (Figure 5) and adult mouse experiments (Figure 6). Therefore, in the revised text we focus on the specific comparisons rather than statistics.

      "Based on predictions from this validated model, we hypothesized that YAP drove proliferation via PI3K, cMyc and FoxM1. To test this model-driven hypothesis, we accurately predicted TT10-induced DNA replication that is suppressed by inhibition of PI3K, cMyc and to a lesser extent FoxM1 (Figure 5D). These model predictions were further validated using RNA-seq and ATAC-seq data from adult mouse hearts showing that constitutively active YAPS5A induces expression of Myc and FoxM1 as well as increased chromatin accessibility at PI3Kca and Myc." Line 454-468

      In the discussion, we add:

      "Further model revision is needed based on these molecular mechanisms of YAP-TEAD-Myc interactions to distinguish between chromatin accessibility, transcription factor binding, and gene expression." Line 649-651

      As it stands now, the generated map largely constitutes already known details offering few if any new insights; however, if updated as new results arise AND made available as a public tool, the model could prove to be a highly valuable resource to the field.

      We thank the reviewer for recognizing our model as a valuable resource and public tool. We have made our model publicly available on GitHub at https://github.com/saucermanlab/Cardiomyocyte-Proliferation-Network.

      The virtual knockdown screens in Figure 3, 4 and 5 provide a wide range of new insights, which we clarify in new text.

      "Because this is a literature-based network model, each component or direct interaction has been studied individually. However, our model makes much broader predictions of how these components interact to regulate proliferation, beyond the ~30 papers available for validation on the response of this system to perturbations shown in Figure 3. For example, Supplemental Figure S3 provides ~5000 predictions of how each protein responds to knockdown of every other protein. These predictions led to new insights into how YAP regulates proliferation via cMyc (experimentally validated in Figure 5 in vitro and Figure 6 in vivo), as well as many other insights that can be validated in future studies. These future studies will be aided by the open-source availability of our model on GitHub." Line 563-571

      I have expertise in cardiomyocyte cell cycle and polyploidization.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The authors generated a computational model of cardiomyocyte proliferation, which predicts molecular drivers of cell cycle progression. Interestingly, the model correctly predicts the outcome of 95% of independent experiments from the literature. The model also elucidated crosstalk between the growth factor and Hippo modules, and the authors identified key hubs through which the Hippo signaling pathway regulates cardiomyocyte proliferation. The model provides a convenient systems framework to prioritize potential signaling drivers of therapeutic modulators of cardiomyocyte proliferation.

      Reviewer #3 (Significance (Required)):

      This is an interesting study reporting the generation of a computational model of cardiomyocyte proliferation, which predicts molecular drivers of cell cycle progression. The program may provide a convenient framework prioritizing potential signaling drivers of therapeutic modulators of cardiomyocyte proliferation. However, the overall impact of the study appears modest since it is unclear whether the study allows elucidation of the unique properties of cardiomyocyte proliferation in adult hearts (i.e. they hardly proliferate) and the validation study was conducted only in neonatal myocytes. The field has seen many studies with neonatal myocytes but the findings are not always translatable to adult cardiomyocytes.

      We thank the reviewer for recognizing the importance of our work that provides a framework for prioritizing potential signaling drivers of therapeutic modulators of CM proliferation.

      Neonatal studies are the most prevalent in the cardiomyocyte proliferation literature, making them the most robust starting point and allowing for rigorous validation. Based on the high performance of the model against neonatal data, we expect this model to be a stepping stone towards future adaptations aimed at understanding differences in the adult cardiomyocyte proliferation network. We have updated our discussion of future directions on this point.

      "While our model's predictions are most relevant to immature cardiomyocytes, it is the first molecular network model of cardiomyocyte proliferation. In the future, this model will enable extensions to identify how factors like species, age and experimental design regulate proliferation. However, such endeavors would span multiple manuscripts and the field currently lacks sufficient stage-specific data. For example, the highly influential computational model of Luo and Rudy focused on adult guinea-pig cardiomyocyte electrophysiology (Luo and Rudy, Circ Res 1994). That model became the foundation for a wide range of development- and species-specific models in electrophysiology (Tusscher et al, AJP 2004; Courtemanche et al, AJP 1998; Paci et al, ABME 2013). We believe the open availability of our code will enable similar dissemination and extension for additional factors regulating cardiomyocyte proliferation." Line 655-665

      The authors described that "Literature articles used for model development came from multiple cell types due to limited CM data." It is unclear whether this would allow the identification of unique mechanisms present in cardiomyocytes. As the authors admitted, the fact that the model predictions and experimental observations for polyploidization did not match clearly suggests the complexity surrounding the possibility of cell phenotypes in cardiomyocyte populations. The authors could have addressed whether this model allows the identification of unique mechanisms mediating cardiomyocyte proliferation in the adult heart.

      Although we necessarily included literature on other cell types to support network reactions, all of the experimental validation in Figure 2 was with cardiomyocyte data (~33 publications). 80% of experiments were from neonatal CMs, 10% from adult CMs, 5% from in vivo studies, and the other 5% from hiPSC-derived cardiomyocytes as annotated in Supplemental File 3.

      At this time, there are insufficient data from which to build a model focused only on adult CMs. The model's open-source availability enables future extensions that examine age- and species-dependent mechanisms of cardiomyocyte proliferation. We updated the manuscript to address the ability of our model to adapt to new information.

      "This model provides an initial network framework for integrating additional discoveries in cardiomyocyte proliferation. As more information becomes available in cardiomyocyte proliferation literature the model can be adapted. Additionally, the field can use our open-sourced model to adapt this model to other developmental stages or species." Line 671-674

      Acknowledging the limited data on cardiomyocyte polyploidy, we performed a new separate validation of 7 experiments in non-myocytes from PMID 23849927, finding an 85.7% accuracy (new Supplementary Figure S7).

      Please provide more information regarding the rationale for having six modules in the authors' model, including the growth factor and the Hippo pathway.

      We revised the text to clarify the motivation for the six modules:

      "Our initial review of the literature indicated multiple complex molecular pathways that regulate cardiomyocyte proliferation, including growth factors, Hippo signaling, G1/S transition, G2/M transition, or cytokinesis pathways (Hashmi and Ahmad, PMID: 31205684; Payan et al, PMID: 30930108; Moral et al., PMID: 35008660; Wang et al., PMID: 30111784; Johnson et al., PMID: 34360531). Several review articles (Zheng et al, PMID: 32664346; Mia and Singh, PMID: 31632964; Diaz Del Moral et al, PMID: 35008660; Besson et al, PMID: 18267085; Wang et al, PMID: 19216791) also organized the literature based on these distinct pathways or processes, which we used to define the boundaries of the six modules. However, how these molecular pathways work together is not well characterized. Therefore, we designed the model to incorporate each of these established modules and how they work together to drive cardiomyocyte proliferation." Line 550-557


      The extent of cardiomyocyte proliferation at baseline is very low in the adult heart. The model identified 25 nodes that may influence baseline proliferation. Is there any evidence to support the involvement of these mechanisms in baseline cardiomyocyte proliferation in vivo?

      We agree with the reviewer that proliferation at baseline is very low in the adult heart, and also rather low in neonatal cardiomyocytes. As shown in Figure S4A, we performed a virtual knockdown screen under baseline conditions that showed that no genetic knockdowns caused a substantial decrease in DNA replication or cytokinesis, consistent with a low baseline proliferation rate.

      We describe this point about baseline proliferation in revised text:

      "A complete virtual knockdown screen of the model was done under baseline conditions in Figure S4A, which showed that no knockdowns caused substantial decreases in DNA replication or cytokinesis. This is consistent with a low baseline proliferation rate described in cardiomyocyte literature." Line 354-357

      The validation study was conducted with neonatal rat ventricular cardiomyocytes. This study could have been repeated with adult cardiomyocytes since they are more resistant to proliferation and, thus, the Myc may not work as expected. In addition, the authors could have commented on the mechanism through which chromatin opening and YAP allow transcription of Myc in the heart.

      We agree that Myc is likely less proliferative in adult hearts. While our model was extensively validated against neonatal cardiomyocytes (Figure 2 for literature, Figure 5 for new neonatal experiments), only 10% of literature-based validations in Figure 2 are from adult cardiomyocytes due to limited data. However, in Figure 6 we validate YAP-dependent signaling to Myc, PI3K, and FOXM1 using RNA-seq and ATAC-seq data from Monroe et al. from adult mouse cardiomyocytes in vivo. While the molecular mechanisms of YAP regulation of Myc are not characterized in the heart, based on the reviewer's suggestion, we added a new discussion of YAP-Myc interaction in other cells:

      "Overexpression of Myc induces cardiomyocyte proliferation in vitro and in vivo in several contexts, with open chromatin and Myc binding near mitotic genes (PMID: 32286286). But to our knowledge, crosstalk of YAP with Myc has not been reported in the heart. Our model prediction and experiments in neonatal cardiomyocytes support a YAP-TEAD-Myc pathway for cardiomyocyte proliferation. Further, our analysis of ATAC-seq and RNA-seq data from Monroe et al. validates that YAP induces Myc chromatin availability and gene expression in adult mouse hearts.

      In MDA-MB-231 breast cancer cells, YAP/TAZ/TEAD bind directly to Myc enhancers through chromatin looping, with decreased acetylation of H3K27 and cell proliferation upon YAP/TAZ knockdown (26258633). YAP-TEAD-Myc signaling regulates the proliferation of cancer cells (26258633), tumorigenesis (29416644), and the growth of Drosophila imaginal discs (20951343). In the future, computational models and experiments are needed to better resolve how YAP promotes proliferation via Myc in the adult heart, including regulation by Mycn (30315164) and cyclin T1 (32286286)." Line 632-644


    1. will very often be negative

      insert " $$\text{the difference }(g_{t}-\bar{g})$$" before "will very often be negative" (LaTeX code: (g_{t}-\bar{g}))

    1. super intelligence is going to be like this across many domains it's going to be 00:31:42 able to find exploits in human code too subtle for humans to notice and it's going to be able to generate code too complicated for any human to understand even if the model spent decades trying to explain it

      for - progress trap - superintelligence threat

      progress trap - superintelligence threat
      - Superintelligence is going to be far beyond our cognitive capabilities across many domains. For example:
        - it is going to be able to find exploits in human code too subtle for humans to notice
        - it is going to be able to generate code too complicated for any human to understand, even if the model spent decades trying to explain it
      - How do we entrust ourselves to a superintelligence that is so far beyond us? If it thinks we are expendable, it could easily find our weaknesses and bring about extinction.

    2. be able to quick Master any domain write trillions lines of code and read every research paper in every scientific field ever written

      for - AI evolution - projections for capabilities by 2030

      AI evolution - projections for 2030
      - AI will be able to do things we cannot even conceive of now because its cognitive capabilities are orders of magnitude faster than our own
      - Write trillions of lines of code
      - Absorb every scientific paper ever written and write new ones
      - Gain the equivalent of billions of human years of experience

    3. having an automated AI research engineer by 2027 00:05:14 to 2028 is not something that is far far off

      for - progress trap - AI - milestone - automated AI researcher

      progress trap - AI - milestone - automated AI researcher
      - This is a serious concern that must be debated.
      - An AI researcher that does research on itself has no moral compass and can encode undecipherable code into future generations of AI, leaving no back door into the AI if something goes wrong.
      - For instance, if AI reached the conclusion that humans need to be eliminated in order to save the biosphere, it could disseminate its strategies covertly through secret communications with unbreakable code.

    1. Video summary [00:00:00] - [00:34:20]:

      The video presents a discussion of body modification, in particular piercing and scarification, and of its cultural and personal significance. The speakers share their experience and knowledge of the history and current practice of these modifications, and of their impact on identity and group belonging.

      Highlights:
      + [00:00:15] Introduction and context
        * Introduction of the speakers and their experience in body modification
        * Discussion of the cultural and personal importance of modifications
      + [00:05:28] Piercing and body modification
        * Explanation of what piercing is and how it has evolved across cultures
        * Examples of piercings and modifications in different societies
      + [00:10:27] Modern materials and jewellery
        * Overview of the biocompatible materials used in modern piercings
        * Discussion of implant standards and material safety
      + [00:22:32] Scarification and its evolution
        * Exploration of traditional scarification and its motivations
        * Comparison with contemporary scarification practices
      + [00:31:53] Motivations for body modification today
        * Reflection on the personal and aesthetic reasons for body modification
        * Impact on individual identity and group belonging

      Video summary [00:00:00] - [00:34:20]:

      This video presents a discussion of body modification, in particular tattooing and piercing, and of its cultural and personal significance. The speakers share their experiences and perspectives on how these practices have evolved and how they affect individual identity.

      Highlights:
      + [00:00:00] Introduction and context
        * Introduction of the speakers and their experience in the field of body modification
        * Discussion of the evolution of tattooing and piercing
      + [00:10:55] Cultural significance of piercing
        * Exploration of piercing practices in different cultures and their ritual or social meaning
        * Comparison with modern piercing trends
      + [00:22:30] Scarification and its history
        * Explanation of scarification and its various motivations, including aesthetic and ritual ones
        * Examples of traditional and contemporary scarification
      + [00:31:53] Motivations for body modification today
        * Reflection on the personal and aesthetic reasons that lead people to modify their bodies
        * The importance of identity and belonging through body modification

      Video summary [00:34:23] - [01:05:18]:

      The video explores the role and impact of bodily rituals in contemporary cultures, in particular body modifications such as piercings and tattoos. It discusses how these practices can mark an important transition in a person's life, often associated with an altered state of consciousness or a sense of personal accomplishment.

      Highlights:
      + [00:34:23] The meaning of rituals
        * Importance of rituals in the passage to a new state
        * Psychological and physical impact of rituals
        * Fear and surpassing oneself as key elements
      + [00:42:25] The study of ritual-related stress
        * Analysis of physiological reactions to ritual stress
        * Observation of a return to calm after the ritual
        * Positive effect of rituals on participants and spectators
      + [00:48:32] History and meaning of tattooing
        * Evolution of tattooing from antiquity to the present day
        * Social, punitive and decorative meanings of tattoos
        * Influence of tattoos on cultural and personal identity
      + [00:57:00] Revival of tattooing practices
        * Resurgence of tattoos as an expression of cultural pride
        * Efforts to preserve tattooing traditions despite globalization
        * The importance of education and of preserving traditional meanings

      Video summary [01:05:20] - [01:40:18]:

      The video explores the history and cultural significance of tattoos, focusing on their evolution, their symbolism and their social acceptance. It covers the origins of tattoos, their role in various societies and cultures, and how they became a means of personal expression in the modern world.

      Highlights:
      + [01:05:20] Symbolism and cultural reaction
        * Discussion of symbols of revolt and their adoption by popular culture
        * The use of tattoos as a form of protest and individual expression
        * Examples of symbolic tattoos and their meaning in different contexts
      + [01:08:05] History of tattooing among sailors
        * The influence of sailors on the spread of tattooing around the world
        * Tattooing as an initiation ritual and a mark of belonging to a community
        * Meaning of traditional motifs and their link to voyages and lived experiences
      + [01:10:04] Tattoos in criminal circles
        * The coded language of tattoos in the Russian mafia and their complex meaning
        * How tattoos tell an individual's personal story and criminal career
        * The consequences of identifiable tattoos for solving criminal cases
      + [01:12:22] Democratization and visibility of tattoos
        * The growing popularity of tattoos thanks to their visibility in the media and fashion
        * The impact of celebrities and influencers on public perception of tattoos
        * Statistics on the growth in the number of tattoo studios and on interest in tattoo conventions
      + [01:18:01] Placement and visibility of tattoos
        * The growing trend of visible tattoos and their social acceptance
        * The feminization of tattooing and the rising number of tattooed women
        * The importance of tattoos as a means of personal expression and self-confidence
      + [01:25:08] Scarification and therapy
        * Discussion of scarification as a means of expression for young people in difficulty
        * The therapeutic potential of tattoos and their impact on self-confidence
        * Regulation and training in hygiene and sanitation for tattoo professionals

      Video summary [01:40:21] - [01:54:04]:

      This video discusses practices and materials in tattoo and piercing studios, emphasizing the importance of hygiene, informed consent and the use of materials that are safe for clients.

      Highlights:
      + [01:40:21] The evolution of tattoo and piercing studios
        * Founding in 2008 and the progress made since
        * Importance of certain practices and materials
        * The need for clients to do their own research and trust their instincts
      + [01:42:22] Hygiene protocols and use of medical equipment
        * Use of sterile gloves and needles to prevent cross-contamination
        * Adherence to good practice in studios
        * Subtle differences in protocols depending on each studio's ethics
      + [01:44:53] Materials used for piercing
        * Use of implant-grade materials such as titanium
        * Importance of the finish and alloy of the jewellery
        * Legislation on the use of nickel to prevent allergies
      + [01:49:20] Consent and traceability
        * Legal obligation of consent and of traceability of the equipment used
        * Assessment of the person's general condition before piercing
        * Impact of the health crisis on procedures and appointments

    1. Reviewer #3 (Public Review):

      Summary:

      Li et al. describe an audiovisual temporal recalibration experiment in which participants perform baseline sessions of ternary order judgments about audiovisual stimulus pairs with various stimulus-onset asynchronies (SOAs). These are followed by adaptation at several adapting SOAs (each on a different day), followed by post-adaptation sessions to assess changes in psychometric functions. The key novelty is the formal specification and application/fit of a causal-inference model for the perception of relative timing, providing simulated predictions for the complete set of psychometric functions both pre- and post-adaptation.

      Strengths:

      (1) Formal models are preferable to vague theoretical statements about a process, and prior to this work, certain accounts of temporal recalibration (specifically those that do not rely on a population code) had only qualitative theoretical statements to explain how/why the magnitude of recalibration changes non-linearly with the stimulus-onset asynchrony of the adaptor.

      (2) The experiment is appropriate, the methods are well described, and the average model prediction is a fairly good match to the average data (Figure 4). Conclusions may be overstated slightly, but seem to be essentially supported by the data and modelling.

      (3) The work should be impactful. There seems a good chance that this will become the go-to modelling framework for those exploring non-population-code accounts of temporal recalibration (or comparing them with population-code accounts).

      (4) A key issue for the generality of the model, specifically in terms of recalibration asymmetries reported by other authors that are inconsistent with those reported here, is properly acknowledged in the discussion.

      Weaknesses:

      (1) The evidence for the model comes in two forms. First, two trends in the data (non-linearity and asymmetry) are illustrated, and the model is shown to be capable of delivering patterns like these. Second, the model is compared, via AIC, to three other models. However, the main comparison models are clearly not going to fit the data very well, so the fact that the new model fits better does not seem all that compelling. I would suggest that the authors consider a comparison with the atheoretical model they use to first illustrate the data (in Figure 2). This model fits all sessions but with complete freedom to move the bias around (whereas the new model constrains the way bias changes via a principled account). The atheoretical model will obviously fit better, but will have many more free parameters, so a comparison via AIC/BIC or similar should be informative.

      (2) It does not appear that some key comparisons have been subjected to appropriate inferential statistical tests. Specifically, lines 196-207 - presumably this is the mean (and SD or SE) change in AIC between models across the group of 9 observers. So are these differences actually significant, for example via t-test?

      (3) The manuscript tends to gloss over the population-code account of temporal recalibration, which can already provide a quantitative account of how the magnitude of recalibration varies with adaptor SOA. This could be better acknowledged, and the features a population code may struggle with (asymmetry?) could be considered.

      (4) The engagement with relevant past literature seems a little thin. Firstly, papers that have applied causal inference modelling to judgments of relative timing are overlooked (see references below). There should be greater clarity regarding how the modelling here builds on or differs from these previous papers (most obviously in terms of additionally modelling the recalibration process, but other details may vary too). Secondly, there is no discussion of previous findings like that in Fujisaki et al.'s seminal work on recalibration, where the spatial overlap of the audio and visual events didn't seem to matter (although admittedly this was an N = 2 control experiment). This kind of finding would seem relevant to a causal inference account.

      References:

      Magnotti JF, Ma WJ and Beauchamp MS (2013) Causal inference of asynchronous audiovisual speech. Front. Psychol. 4:798. doi: 10.3389/fpsyg.2013.00798

      Sato, Y. (2021). Comparing Bayesian models for simultaneity judgement with different causal assumptions. J. Math. Psychol., 102, 102521.

      (5) As a minor point, the model relies on simulation, which may limit its take-up/application by others in the field.

      (6) There is little in the way of reassurance regarding the model's identifiability and recoverability. The authors might for example consider some parameter recovery simulations or similar.

      (7) I don't recall any statements about open science and the availability of code and data.
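
      Weaknesses (1) and (2) above ask for a model comparison via AIC and an inferential test on the per-observer differences. A minimal sketch of one way to run such a check is given below. The ΔAIC values are hypothetical placeholders, not data from the manuscript, and the function names are illustrative. An exact sign-flip permutation test is used here in place of the t-test mentioned in the review, only so that the example needs nothing beyond the Python standard library.

```python
from itertools import product


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis each difference is equally likely to be
    positive or negative, so every sign pattern is enumerated and the
    fraction with a sum at least as extreme as the observed one is returned.
    """
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical per-observer differences AIC(alternative) - AIC(new model)
# for 9 observers; illustrative numbers only.
delta_aic = [12.1, 8.4, 15.0, 3.2, 9.7, 11.3, 6.8, 14.5, 7.9]
p = sign_flip_p_value(delta_aic)
print(f"exact sign-flip p = {p:.4f}")
```

      With nine observers the smallest two-sided p-value this test can return is 2/512, reached only when every observer favours the same model, which makes the resolution limit of a small sample explicit.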

    1. Reviewer #3 (Public Review):

      Summary:

      The burst fraction neural code has conceptual interest but has been little examined in vivo. This study examines and compares the burst fraction, the standard firing rate (firing rate) code, and the related event fraction (event rate) code using published data from an experiment where rats learned to lick after detecting electrical microstimulation in the somatosensory (barrel) cortex. Analyzing single-neuron spiking responses, the study reports that the burst fraction identifies more and different neurons showing the effects of training than the firing rate. The study further claims that the burst fraction (1) most promptly responded to false-negative detection errors, (2) during further training of trained animals (from 80% to 90% accuracy, over five days), correlates with behavioral accuracy, and (3) by shifting earlier to align with the (relatively constant) event rate modulation, leads to the observed sharpened firing rate response during this further training. The study concludes that 'a fine-grained separation of spike timing patterns [into burst fraction, firing rate, and event rate] reveals two signals,' an error signal and a sharpening signal.

      Strengths:

      The burst fraction is shown to discern more (and somewhat different) cells showing significant responses in trained animals and also to reveal a larger absolute difference in the fraction of responsive cells between naïve and trained animals. The Poisson model analysis particularly convincingly shows that the firing rate alone cannot explain either the spiking pattern or the prevalence of burst fraction-ON cells, thereby furnishing strong evidence that the burst fraction conveys independent information from the firing rate. The demonstration of error signals on miss trials in all three neural codes (burst fraction, firing rate, event rate) is interesting. It is also interesting to see that neural responses broadly shift earlier for animals even during further training in an already 'expert' stage and that the burst fraction correlates with further accuracy increases.

      Weaknesses:

      The evidence is inadequate for the burst fraction as responding more promptly to missed trials.

      This key claim seems to rest solely on the timing of the first bins in Figure 3B showing statistically significant differences. This reasoning implicitly draws inferences from the lack of statistical differences, which cannot support a positive claim in general. Specifically, here, the burst fraction is calculated with a division, which can magnify small differences and impact the power of statistical tests. If I trace back from the first bin showing significant differences to the first bin the signal starts rising, the timing seems to be comparable for all three neural codes (~1.6 s).

      Pertinently, what is the statistical test used in Figure 3B? A parametric test may be inappropriate for the burst fraction, a ratio that likely does not fulfill the normality assumption. An inappropriate test would compound the problem of concluding from the lack of (early) significant differences.

      The evidence that burst fraction is responsible for sharpening is opaque due to insufficient statistical reporting. Specifically, it seems there is a correlation between firing rate and accuracy that is reported as non-significant.

      Changes in the reaction times (or other movement parameters) over training may confound the correlation of the burst fraction with the accuracy and firing rate sharpening during further training. Lack of control for changes in movement over training weakens the results.

      The claim of independence of burst fraction and event rate/firing rate information is too strong. The authors show a significant negative correlation between burst fraction and firing rate (2D).

      The claim that there is no 'functional reorganization' beyond day two is too strong. Although this claim is not a core one to the study, it derives from an absence of statistical significance, especially problematic here as the effect sizes are large. For example, the Spearman correlation is 0.67/0.87 for the analyses with burst fraction. With only five data points, even strong effects may not achieve statistical significance, making negative conclusions problematic. Further, how are the p-values calculated (if using a parametric test, are the assumptions met), and why should these analyses use Spearman's correlation when analogous analyses in Figure 4E, F use Pearson's r?
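
      The point about five data points can be made concrete with an exact permutation test for Spearman's correlation. The sketch below is illustrative only: the accuracy and burst-fraction values are hypothetical, not taken from the study, and the rank function assumes no ties. For these made-up values rho is 0.9, yet only 120 orderings exist and 10 of them are at least as extreme, so the exact two-sided p-value is 10/120, about 0.083.

```python
from itertools import permutations


def ranks(values):
    """Return ranks 1..n for the values; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_p_value(x, y):
    """Two-sided exact p-value: share of all orderings of y with |rho| >= observed."""
    observed = abs(spearman_rho(x, y))
    all_orders = list(permutations(y))
    extreme = sum(
        1 for perm in all_orders
        if abs(spearman_rho(x, perm)) >= observed - 1e-12  # float tolerance
    )
    return extreme / len(all_orders)


# Hypothetical values for five training days (illustrative only).
accuracy = [80.0, 82.5, 85.0, 87.5, 90.0]
burst_fraction = [0.10, 0.12, 0.15, 0.21, 0.18]

rho = spearman_rho(accuracy, burst_fraction)
p = exact_p_value(accuracy, burst_fraction)
print(f"rho = {rho:.2f}, exact two-sided p = {p:.3f}")
```

      With n = 5 and no ties, only a perfect monotonic ordering reaches p < 0.05 in this two-sided test, so a non-significant result says little about whether an effect is absent.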

      Does the burst fraction correlate with accuracy in cross-training?

      If the burst fraction correlates with accuracy, it should be expected to do so also when the animals progress from the naïve to the trained stage. Moreover, the correlation in Figure 4E can benefit from strengthening as it is now supported by only five points, is driven by only three 'clusters,' and only represents a narrow range of accuracies. If the data is available for this analysis, it should be done to test and potentially strengthen the main claim of the study.

      The text and figures contain numerous ambiguities that need to be clarified. These do not include obvious typos, only elements that affect conceptual understanding.

      - Some key terms in the main claims are never defined. For example, in the title, it is unclear what 'fast' and 'transients' mean. The abstract uses, but the main text never defines, 'demultiplexing,' 'a *conjunctive* burst code,' 'sparse and succinct [sic],' and 'correlated more *globally*.'

      - Some paper components are un(der)explained and, sometimes, apparently internally inconsistent. For example, in Figure 1I, the fraction of firing rate-ON cells does not look like the 6% shown in Figure 1J, left. In Figure 2E-G, what is the total cell number, 279, in Figure 2G legend, why is it different from the 153 total cells in Figure 2E legend, and what is the 'n = 5' within Figure 2G? All n numbers should be explained in general; more examples include the 245 in Figure 3C and the 49 in Figure 3B. In Figure 3C, what is the top horizontal bar (I assume significant differences)? About catch trials, the Figure 3D legend says rewards are given on licks, but the text says licking was not rewarded; which is the case? Figure 4B legend says 'firing rate (left), burst fraction (middle) and event rate (right),' but the plot colors imply a different order.

      - The abstract states, 'The alignment of bursting and event rate modulation [...] was strongly associated [sic] behavioral accuracy.' It seems to me it is not the alignment of burst fraction and event rate but rather burst fraction per se that correlates with behavioral accuracy (Figure 4E right). At least, the latter correlation is the only one tested.

    1. social dialogue 00:26:44 does not leave out the students either. Student unions are really associations that may be formed freely but must be authorized by the head of school and the governing board (conseil d'administration) in order to carry out 00:26:57 their activities within lycées; I refer you to article R 511-9 of the Education Code. Students' freedom of assembly is provided for and regulated in articles 00:27:11 L511-2 and R51-10 of the Education Code, as is their freedom of expression, which is enshrined in article R 511-8. While the head of school must 00:27:24 allow student associations to enjoy their rights and give them a few facilities, here again a letterbox and a notice board, he must above all be aware that he is the guarantor that neither the purpose nor the activity of 00:27:36 the association is political or religious and that both are compatible with the principles of the public education service, all in compliance with the Criminal Code. What is at stake is public order in 00:27:48 schools and, consequently, a calm social dialogue
    2. the head of school is also the guarantor of a constructive social dialogue with users. On the one hand, parents' associations take part in the various collegial bodies of public 00:26:06 schools, and the Education Code devotes a special subsection to them at article D 111-6 and following. The Code specifies that parents' associations must 00:26:18 have as their purpose the defence of the moral and material interests common to pupils' parents. In carrying out their mission, the associations also benefit from a number of material and logistical facilities that the head 00:26:31 of school must provide: a letterbox, notice boards and, where appropriate, authorization for occasional meetings, and perhaps sometimes computer equipment
    1. You can also use Bash in the Azure Cloud Shell to connect to your VM. You can use Cloud Shell in a web browser, from the Azure portal, or as a terminal in Visual Studio Code using the Azure Account extension. You can also install the Windows Subsystem for Linux to connect to your VM over SSH and use other native Linux tools within a Bash shell.

      the Azure portal streamlines the process so there is minimal overhead during deployment

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This study examines the spatial and temporal patterns of occurrence and the interspecific associations within a terrestrial mammalian community along human disturbance gradients. They conclude that human activity leads to a higher incidence of positive associations.

      Strengths:

      The theoretical framework of the study is brilliantly introduced. Solid data and sound methodology. This study is based on an extensive series of camera trap data. Good review of the literature on this topic.

      Weaknesses:

      The authors use the terms associations and interactions interchangeably.

      This is not the case. In fact, we state specifically that "... interspecific associations should not be directly interpreted as a signal of biotic interactions between pairs of species…" However, co-occurrence can be an important predictor of likely interactions, such as competition and predation. We stand by our original text.

      It is not clear what the authors mean by "associations". A brief clarification would be helpful.

      Our specific definition of what is meant here by spatial association can be found in the Methods section. To clarify, the index of association is calculated from the covariance between the two species' residuals (epsilon) after accounting for each species' responses to known environmental covariates. These covariances are modelled so that they can vary with the level of human disturbance, measured as human presence and human modification. After normalization, the final index of association is a correlation value that varies between -1 (complete disassociation) and +1 (complete positive association).
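
      The normalization step described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical residual covariance matrix for three species with made-up numbers. It is not the authors' JSDM code, and it leaves out the dependence of the covariances on human disturbance.

```python
import math


def covariance_to_correlation(cov):
    """Normalize a square covariance matrix to a correlation matrix.

    Each entry cov[i][j] is divided by the product of the standard
    deviations of species i and j, so the result lies in [-1, 1].
    """
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


# Hypothetical residual covariance among three species (illustrative only).
residual_cov = [
    [4.0, 2.0, -1.0],
    [2.0, 9.0, 0.0],
    [-1.0, 0.0, 1.0],
]

association = covariance_to_correlation(residual_cov)
for row in association:
    print([round(value, 3) for value in row])
```

      In this toy example species 1 and 2 have a positive association (about 0.33), species 1 and 3 a negative one (-0.5), and species 2 and 3 none (0).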

      Also, the authors do not delve into the different types of association found in the study. A more ecological perspective explaining why certain species tend to exhibit negative associations and why others show the opposite pattern (and thus, can be used as indicator species) is missing.

      Suggesting the ecological underpinnings of the associations observed here would mainly be speculation at this point, but the associations demonstrated in this analysis do suggest promising areas for the more detailed research suggested.

      Also, the authors do not distinguish between significant (true) non-random associations and random associations. In my opinion, associations are those in which two species co-occur more or less than expected by chance. This is not well addressed in the present version of the manuscript.

      Results were considered to be non-random if correlation coefficients (for spatial association) or overlap (for temporal association) fell outside of 95% Confidence Intervals. This is now stated clearly in the Methods section. In Figure 3—figure supplement 1-3 and Figure 4—figure supplement 1-3, p<0.01 levels are also presented.

      The obtained results support the conclusions of the study.

      Anthropogenic pressures can shape species associations by increasing spatial and temporal co-occurrence, but above a certain threshold, the positive influence of human activity in terms of species associations could be reverted. This study can stimulate further work in this direction.

      Reviewer #2 (Public Review):

      Summary:

      This study analyses camera trapping information on the occurrence of forest mammals along a gradient of human modification of the environment. The key hypotheses are that human disturbance squeezes wildlife into a smaller area or their activity into only part of the day, leading to increased co-occurrence under modification. The method used is joint species distribution modelling (JSDM).

      Strengths:

      The data source seems to be very nice, although since very little information is presented, this is hard to be sure of. Also, the JSDM approach is, in principle, a nice way of simultaneously analysing the data.

      Weaknesses:

      The manuscript suffers from a mismatch of hypotheses and methods at two different levels.

      (1) At the lower level, we first need to understand what the individual species do and "like" (their environmental niche). That information is not presented, and the methods suggest that the representation of each species in the JSDM is likely to be extremely poor.

      The response of each species to the environmental covariates provides a window into their environmental niche, encapsulated in the beta coefficients for each environmental covariate. This information is presented in Figure 2.

      (2) The hypothesis clearly asks for an analysis of the statistical interaction between human disturbance and co-occurrence. Yet, the model is not set up this way, and the authors thus do a lot of indirect exploration, rather than direct hypothesis testing.

      Our JSDM model is set up specifically to examine the effect of human disturbance on co-occurrence, after controlling for shared responses to environmental variables. It directly tests the first hypothesis: if an increase in the indices of human disturbance had not tended to increase the spatial correlations between species detected by the model, we would have rejected our stated hypothesis that human modification of habitats results in increased positive spatial associations between species.

      Even when the focus is not the individual species, but rather their association, we need to formulate what the expectation is. The hypotheses point towards presenting the spatial and the temporal niche, and how it changes, species for species, under human disturbance. To this, one can then add the layer of interspecific associations.

      Examining each species one by one and how each one responds to human disturbance would miss the effects of any meaningful interactions between species.  The analysis presented provides a means to highlight associations that would have been overlooked.  Future research could go on to analyze the strongest associations in the community and the strongest effects of human disturbance so as to uncover the underlying interactions that give rise to them and the mechanisms of human impact.  We believe that this will prove to be a much more productive approach than trying to tackle this problem species by species and pair by pair.

      The change in activity and space use can be analysed much more simply, by looking at the activity times and spatial distribution directly. It remains unclear what the contribution of the JSDM is, unless it is able to represent this activity and spatial information, and put it in a testable interaction with human disturbance.

      The topic is actually rather complicated. If biotic interactions change along the disturbance gradient, then observed data are already the outcome of such changed interactions. We thus cannot use the data to infer them! But we can show, for each species, that the habitat preferences change along the disturbance gradient - or not, as the case may be.

      Then, in the next step, one would have to formulate specific hypotheses about which species are likely to change their associations more, and which less (based e.g. on predator-prey or competitive interactions). The data and analyses presented do not answer any of these issues.

      We suggest that the so-called “simpler” approach described above is anything but simple, and this is precisely what the Joint Species Distribution Model improves upon.  As pointed out in the Introduction, simply examining spatial overlap is not enough to detect a signal of meaningful biotic interaction, since overlap could be the result of similar responses to environmental variables.  With the JSDM approach, this would not be considered a positive association and would then not imply the possible existence of meaningful interaction.

      Another more substantial point is that, according to my understanding of the methods, the per-species models are very inappropriate: the predictors are only linear, and there are no statistical interactions (L374). There is no conceivable species in the world whose niche would be described by such an oversimplified model.

      While interaction terms can be included in the JSDM, this would considerably increase the complexity of the models.  In previous work, we have found no strong evidence for the importance of interaction terms and they do not improve the performance of the models.

      We have no idea of even the most basic characteristics of the per-species models: prevalences, coefficient estimates, D2 of the model, and analysis of the temporal and spatial autocorrelation of the residuals, although they form the basis for the association analysis!

      The coefficient estimates for response to environmental variables used in the JSDM are provided in Figure 2 and Figure 2—source data 1.

      Why are times of day and day of the year not included as predictors IN INTERACTION with niche predictors and human disturbance, since they represent the temporal dimension on which niches are hypothesised to change?

      Also, all correlations among species should be shown for the raw data and for the model residuals: how much does that actually change and can thus be explained by the niche models?

      The discussion has little to add to the results. The complexity of the challenge (understanding a community-level response after accounting for species-level responses) is not met, and instead substantial room is given to general statements of how important this line of research is. I failed to see any advance in ecological understanding at the community level.

      We agree that the community-level response to human disturbance is a complex topic, and we believe it is also a very important one. This research, with its support of the spatial compression hypothesis, does not provide definitive answers on detailed mechanisms, but it opens up new lines of inquiry that make it an important advance. For example, the strong effects of human disturbance on certain associations that were detected here could now be examined with the kind of detailed species by species and pair by pair analysis that this reviewer appears to demand.

      Reviewer #1 (Recommendations For The Authors):

      L27 indicates instead of "idicates".

      We thank the reviewer for catching that error.

      L64 I would refer to potential interactions or just associations. It is always hard to provide evidence for the existence of true interactions.

      We have revised to “potential interactions” to qualify this statement.

      L69 Suggestion: distort instead of upset.

      We thank the reviewer for catching that error.

      L70-71 Here, authors use the term associations. Please, be consistent with the terminology throughout the manuscript.

      We thank the reviewer for raising this important point.  The term “co-occurrence” appears to be used inconsistently in the literature, so we have tried to refer to it only when referencing the work of us. For us, co-occurrence means “spatial overlap” without qualification as to whether it is caused by interaction or simply by similar responses to environmental factors (see Blanchet et al. 2020, Argument 1). In our view, interactions refer to biotic effects like predation, competition, commensalism, etc., while associations are the statistical footprint of these processes.   In keeping with this understanding, in Line 73, we changed "association" to the stronger word "interaction," but in Line 76, we keep the words "spatiotemporal association", which is presumed to be the result of those interactions. In Line 91, we have changed “interactions” to “associations,” as we do not believe interactions were demonstrated in that study. 

      L76 "Species associations are not necessarily fixed as positive or negative..." This sentence is misleading. I would say that species associations can vary across time and space, for instance along an environmental gradient.

      We thank the reviewer for pointing out the potential for confusion.  In Line 79, we have changed as suggested.

      L78 "Associations between free-ranging species are especially context-dependent" Loose sentence. Please, explain a bit further.

      We have changed the sentence to be more specific; ”Interactions are known to be context-dependent; for example, gradients in stress are associated with variation in the outcomes of pairwise species interactions.”

      L83-85 This would be a good place to introduce the 'stress gradient' hypothesis, which has also been applied to faunal communities in a few studies. According to this hypothesis, the incidence of positive associations should increase as environmental conditions harden.

      In our review of the literature, we find that the stress gradient hypothesis is somewhat controversial and does not receive strong support in vertebrates.  We have added the phrase “…the controversial stress-gradient hypothesis predicts that positive associations should increase as environmental conditions become more severe…”

      L86-88 Well, overall, the number of studies examining spatiotemporal associations in vertebrates is relatively small. That is, bird associations have not received much more attention than those of mammals. I find this introductory/appealing paragraph a bit rough. I think the authors can do better and find a better justification for their work.

      We thank the reviewer for the comments.  We have rewritten the paragraph extensively to make it clearer and to provide a stronger justification for the study.

      L106 "[...] resulting in increased positive spatial associations between species" I'd say that habitat shrinking would increase the level of species clustering or co-occurrence, but in my opinion, not necessarily the incidence of positive associations. It is not clear to me if the authors use positive associations as a term analogous to co-occurrence.

      We thank the reviewer for raising this very important distinction. Habitat shrinking would increase levels of species co-occurrence, but this is not of particular interest here. We wanted to test whether there were effects on species interactions, as revealed by associations. We find that the terms association and co-occurrence are used somewhat loosely in the literature and so have made some new effort to clarify and systematize this in the manuscript. For example, there appear to be differences in the way “co-occurrence” is used in Boron 2023 and in Blanchet 2020. We do not use the term "positive spatial association" as analogous to "spatial co-occurrence." Spatial co-occurrence, which for us has the meaning of spatial overlap, could simply be the result of similar reactions to environmental co-variates, not reflecting any biotic interaction. Joint Species Distribution Models enable the partitioning of spatial overlap and segregation into that which can be explained by responses to known environmental factors, and that which cannot be explained and thus might be the result of biotic interactions. It is only the latter that we are calling spatial association, which can be positive or negative. These associations may be the statistical footprint of biotic interactions.

      Results:

      Difference between random and non-random association patterns. It is not clear to me if the reported associations are significant or not. The authors only report the sign of the association (either positive or negative) but do not clarify if these associations indicate that two species coexist more or less than expected by chance. In my opinion, that is the difference between true ecological associations (e.g., via facilitation or competition effects) and random co-existence patterns. This is paramount and should be addressed in a new version of the manuscript.

      This information is provided in Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3. This is referenced in the text as follows, “… correlation coefficients for 18 species pairs were positive and had a 95 % CI that did not overlap zero, and the number increased to 65 in moderate modifications but dropped to 29 at higher modifications" and so on. This criterion for significance (i.e., greater than expected by chance) is now stated at the end of the Materials and methods. In Figure 3—figure supplement 1,2,3 and Figure 4—figure supplement 1,2,3, those correlations that were significant at p<0.01 are also shown.
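
      The criterion quoted above (a 95% interval that does not overlap zero) can be sketched as follows. This is a minimal illustration, assuming the interval is taken from a set of sampled correlation values; the response does not state how the intervals were computed, and `percentile_interval`, `excludes_zero`, and the sample values are hypothetical.

```python
def percentile_interval(samples, level=0.95):
    """Return a central interval covering roughly `level` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    # Drop the same number of values from each end of the sorted samples.
    lo_idx = int(tail * (n - 1))
    hi_idx = n - 1 - lo_idx
    return ordered[lo_idx], ordered[hi_idx]

def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    low, high = interval
    return low > 0 or high < 0

# Hypothetical sampled correlations for two species pairs.
clearly_positive = [0.20 + 0.001 * i for i in range(101)]
straddles_zero = [-0.05 + 0.001 * i for i in range(101)]

print(excludes_zero(percentile_interval(clearly_positive)))
print(excludes_zero(percentile_interval(straddles_zero)))
```

      Under this rule, a pair counts as non-randomly associated only when the entire interval is positive or entirely negative.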

      I am also missing a more ecological explanation for the observed findings. For instance, the top-ranked species in terms of negative associations is the red fox, whereas the muntjac seems to be the species whose presence can be used as an indicator for that of other species. What are the mechanisms underlying these patterns? Do red foxes compete for food with other species? Do the species that show positive associations (red goral, muntjac) have traits or a diet that are more different from those of other species? More discussion on these aspects (role of traits and the trophic niche) would be necessary to better understand the obtained results.

      The purpose of this paper was to test the compression hypotheses, and we have tried to keep that as the focus.  However, the analysis does open up interesting lines of inquiry for future research to decipher the details of the interactions between species and the mechanisms by which human disturbance facilitates or disrupts these interactions. The reviewer raises some interesting possibilities, but at this point, any discussion along these lines would be largely speculation and could lengthen the paper without great benefit. 

      Reviewer #2 (Recommendations For The Authors):

      The manuscript should be accompanied by all data and code of analysis.

      All data and RScripts have been made available in Science Data Bank: https://doi.org/10.57760/sciencedb.11804.

      The sentence "not much is known" is weak: it suggests the authors did not bother to quantify what IS known, and simply waved any previous knowledge aside. Surely we have some ideas about who preys on whom, and which species have overlapping resource requirements (e.g., due to jaw width). For those, we would expect a particularly strong signal, if the association is indeed indicative of interactions.

      We believe that the reviewer is referring to the statement in Line 90-92 about the lack of understanding of the resilience of terrestrial mammal associations to human disturbance.  We have added a reference to one very recent publication that addresses the issue (Boron et al., 2023), but otherwise we stand by our statement. We have, however, added a qualifier to make it clear that we did indeed look for previous knowledge; "However, a review of the literature indicates that ...."

      Figures:

      Fig. 1. This reviewer considers that this is too trivial and should be deleted.

      This is a graphical statement of the hypotheses and may be helpful to some readers.

      Fig. 2. Using points with error bars hides any potential information.

      Done as suggested.

      That only 4 predictors are presented is unacceptably oversimplified.

      Only 4 predictors are included because, in previous work, we found that adding additional predictors or interactions did little to improve the model’s performance (Li et al. 2018, 2021 and 2022) and could lead to over-fitting.

      Fig. 5. and 6. aggregate extremely strongly over species; it remains unclear which species contribute to the signal, and I guess most do not.

      The number of detection events presented in Table 1 should help to clarify the relative contribution of each species to the data presented in Figures 5 and 6.

      This reviewer considers that the introduction 'oversells' the paper.

      L55: can you give any such "unique ecological information"

      L60: Lyons et al. (Kathleen is the first name) has been challenged by Telford et al. (2016 Nature) as methodologically flawed.

      The first name has been deleted.  The methodological flaw has to do with interpretation of the fossil record and choice of samples, not with the need to partition shared environmental preferences and interactions.

      L61 contradicts line 64: Blanchet et al. (2022, specifying some arguments from Dormann et al. 2018 GEB) correctly point out that logically one cannot infer the existence or strength of interactions from co-occurrence data. It is thus wrong to then claim (citing Boron et al.) that such data "convey key information about interactions". The latter statement is incorrect. A tree and a beetle can have extremely high association and nothing to do with each other. Association does not mean anything in itself. When two species are spatially and temporally non-overlapping, they can exhibit perfect "anti-association", yet, by the authors' own definition, cannot interact.

      We believe that the reviewer’s concerns arise from a misunderstanding of how we use the term association.  In our usage, an association is not the same as co-occurrence or overlap, which may simply be the result of shared responses to environmental variables.  The co-occurring tree and beetle would not be found to have any association in our analysis, only shared environmental sensitivities.  In contrast, associations can be the statistical footprint of interactions, and would be overlaid onto any overlap due to similar responses to the environment.  In the case of negative associations, such as might be the result of competitive exclusion or avoidance of predators, the two species would share environmental responses but show lower than expected spatial overlap.  Even though they might be only rarely found in the same vicinity, they would indeed be interacting when they were together.

      Joint Species Distribution Models "allow the partitioning of the observed correlation into that which can be explained by species responses to environmental factors... and that which remains unexplained after controlling for environmental effects and which may reflect biotic interactions." (Garcia Navas et al. 2021). It is the latter that we are calling “associations.”

      L63: Gilbert reference: Good to have a reference for this statement.

      This point is important, but the reviewer’s comments below have made it clear that it is even more important to point out that strong interactions should be expected to lead to significant associations.  We have added a statement to clarify this.

      L70-72: Incorrect, interactions play a role, not associations (which are merely statistical).

      In this, we agree, and we have revised the statement to refer to interactions, not associations. In our view, an interaction is a biological phenomenon, while an association is the resulting statistical signal that we can detect.

      L75: Associations tell us nothing, only interactions do. Since these can not be reliably inferred, this statement and this claim are wrong.

      We thank the reviewer for raising this point, but we beg to disagree. Strong interactions should be expected to lead to significant associations that can be detected in the data. Associations, which can be measured reliably, are the evidence of potential interactions, and hence associations can tell us a great deal.  We have added a note to this effect after the Gilbert reference above to clarify this point.

      However, we do accept that associations must be interpreted with caution. As Blanchet et al. 2020 explain, " …the co-occurrence signals (e.g. a significant positive or negative correlation value) estimated from these models could originate from any abiotic factors that impact species differently. Therefore, this correlation cannot be systematically interpreted as a signal of biotic interactions, as it could instead express potential non-measured environmental drivers (or combinations of them) that influence species distribution and co-distribution.”  Or alternatively an association could be the result of interaction with a 3rd species. 

      L87: Regarding your claim, how would you know you DO understand? For that, you need to formulate an expectation before looking at the data and then show you cannot show what you actually measure. (Jaynes called this the "mind-projection fallacy".)

      We are not sure if the reviewer is criticizing our paper or the entire field of community ecology.  Perhaps it is the statement that “….resilience of interspecific spatiotemporal associations of terrestrial mammals to human activity remains poorly understood….”  Since we are confident that the reviewer believes that mammals do interact, we guess that it is the term “association” that is questioned.  We have revised this to “…the impacts of human activity on interspecific interactions of terrestrial mammals remains poorly understood…” 

      In this particular case, we did formulate an expectation before looking at the data, in the form of the two formal hypotheses that are clearly stated in the Introduction and illustrated in Figure 1. If the hypotheses had not been supported, then we would have accepted that we do not understand. But as the data are consistent with the hypotheses, we submit that we do understand a bit more now.

    1. Video summary [00:00:01] - [00:20:21]:

      This video presents a discussion between Adam Grant and Dan Ariely, a behavioral economist, on the honest truth about dishonesty. They explore how frequent dishonesty is in organizations, the impact of small cheaters compared with big ones, and how negative behaviors can become acceptable in an organizational environment. They also discuss the role of leadership, whistleblowers, and the importance of a clear code of conduct for preventing dishonesty.

      Highlights:
      + [00:00:01] Dishonesty in organizations
        * Very common, but mostly small-scale cheating
        * Small cheaters have a larger economic impact than big ones
      + [00:03:45] The role of leadership and whistleblowers
        * Leadership can influence organizational behavior
        * Whistleblowers are often treated as outsiders
      + [00:06:20] The slippery slope of dishonesty
        * Large frauds often begin with small steps
        * Importance of recognizing the first signs of unethical behavior
      + [00:13:01] The importance of a clear code of conduct
        * A precise code of conduct helps individuals tell right from wrong
        * Vague rules make dishonesty easier to rationalize

    1. Next, download the code file script_p3c3.py from this folder and run it in your editor. Take the time to understand what each line does, and feel free to watch the screencasts several times if needed.

      Note: title extraction no longer works in the p3c3 program following a change to the HTML. First problem: the "a" tag must be replaced with the "div" tag. Second problem: string does not work because there are \n characters in the code, so string returns None. The uses of string must be replaced with get_text().

      It would also be welcome to add a short explanation of the \n character, which is present in a lot of HTML code and must be removed during web scraping.

      Finally, I recommend modifying the corresponding line as follows: with open("data.csv", "w", newline="") as fichier:

      Adding newline="" removes the extra line that is automatically generated when writing to the csv file.
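
      The last two points can be tried out with the standard library alone. The sketch below is not the course script: the raw strings are hypothetical stand-ins for text as get_text() might return it from HTML containing line breaks, and `clean` is an illustrative helper.

```python
import csv
import os
import tempfile

# Hypothetical scraped strings, with the line breaks and indentation
# that often surround text in HTML source.
raw_titles = ["\n    First title\n  ", "\n    Second   title\n"]

def clean(text):
    """Collapse newlines and repeated whitespace into single spaces."""
    return " ".join(text.split())

titles = [clean(t) for t in raw_titles]

path = os.path.join(tempfile.mkdtemp(), "data.csv")
# newline="" lets the csv module control line endings itself, which avoids
# the blank row after every record that can otherwise appear on Windows.
with open(path, "w", newline="", encoding="utf-8") as fichier:
    writer = csv.writer(fichier)
    writer.writerow(["title"])
    for title in titles:
        writer.writerow([title])

# Read the file back to check that no empty rows were written.
with open(path, newline="", encoding="utf-8") as fichier:
    rows = list(csv.reader(fichier))
print(rows)
```

      Splitting on whitespace and rejoining removes every \n as well as the indentation around it, so the cleaned titles fit on a single CSV line each.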

    1. Author response:

      The following is the authors’ response to the original reviews.

      Your editorial guidance, reviews, and suggestions have led us to make substantial changes to our manuscript. While we detail point-by-point responses in typical fashion below, I wanted to outline, at a high level, what we’ve done.

      (1) Methods. Your suggestions led us to rethink our presentation of our methods, which are now described more cohesively in a new methods section in the main text.

      (2) Model Validation & Robustness. Reviewers suggested various validations and checks to ensure that our findings were not, for instance, the consequence of a particular choice of parameter. These can be found in the supplementary materials.

      (3) Data Cleaning & Inclusion/Exclusion. Finally, based on feedback, our new methods section fully describes the process by which we cleaned our original data, and on what grounds we included/excluded individual faculty records from analysis.

      eLife assessment

      Efforts to increase the representation of women in academia have focussed on efforts to recruit more women and to reduce the attrition of women. This study - which is based on analyses of data on more than 250,000 tenured and tenure-track faculty from the period 2011-2020, and the predictions of counterfactual models - shows that hiring more women has a bigger impact than reducing attrition. The study is an important contribution to work on gender representation in academia, and while the evidence in support of the findings is solid, the description of the methods used is in need of improvement.

      Reviewer #1 (Public Review):

      Summary and strengths

      This is an interesting paper that concludes that hiring more women will do more to improve the gender balance of (US) academia than improving the attrition rates of women (which are usually higher than men's). Other groups have reported similar findings but this study uses a larger than usual dataset that spans many fields and institutions, so it is a good contribution to the field.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Weaknesses

      The paper uses a mixture of mathematical models (basically Leslie matrices, though that term isn't mentioned here) parameterised using statistical models fitted to data. However, the description of the methods needs to be improved significantly. The author should consider citing Matrix Population Models by Caswell (Second Edition; 2006; OUP) as a general introduction to these methods, and consider citing some or all of the following as examples of similar studies performed with these models:

      Shaw and Stanton. 2012. Proc Roy Soc B 279:3736-3741

      Brower and James. 2020. PLOS One 15:e0226392

      James and Brower. 2022. Royal Society Open Science 9:220785

      Lawrence and Chen. 2015. [http://128.97.186.17/index.php/pwp/article/view/PWP-CCPR-2015-008]

      Danell and Hjerm. 2013. Scientometrics 94:999-1006

      We have expanded the description of methods in a new methods section of the paper which we hope will address the reviewer’s concerns.

      We agree that our model of faculty hiring and attrition resembles Leslie matrices. In results section B, we now mention Leslie matrices and cite Matrix Population Models by Caswell, noting a few key differences between Leslie matrices and the model of hiring and attrition presented in this work. Most notably, in the hiring and attrition model presented, the number of new hires is not based on per-capita fertility constants. Instead, population sizes are predetermined fixed values for each year, precluding exponential population growth or decay towards 0 that is commonly observed in the asymptotic behavior of linear Leslie Matrix models.
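
      The contrast with a Leslie matrix can be seen in a toy projection. This is a deliberately simplified sketch and not the model used in the paper: it has no career-age structure, it holds the total faculty size constant rather than following a year-specific schedule, and the function name, rates, and headcounts are hypothetical.

```python
def simulate(women, men, years, attrition_w, attrition_m, hire_frac_women):
    """Project the fraction of women with a fixed total faculty size."""
    total = women + men
    history = [women / total]
    for _ in range(years):
        # Attrition removes a gender-specific fraction of current faculty.
        women *= 1.0 - attrition_w
        men *= 1.0 - attrition_m
        # Hires refill the vacancies so the total returns to its fixed size,
        # instead of arising from per-capita fertility as in a Leslie matrix.
        vacancies = total - (women + men)
        women += vacancies * hire_frac_women
        men += vacancies * (1.0 - hire_frac_women)
        history.append(women / total)
    return history

# Hypothetical field: 30% women, equal attrition, gender-balanced hiring.
trajectory = simulate(300.0, 700.0, 10, 0.05, 0.05, 0.5)
print(trajectory[:3])
```

      Because the number of hires is set by the vacancies and not by the size of the existing population, the total cannot grow or decay exponentially; only the composition changes.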

      We have additionally revised the main text to cite the listed examples of similar studies (we had already cited James and Brower, 2022). We thank the reviewer for bringing these relevant works to our attention.

      The analysis also runs the risk of conflating the fraction of women in a field with gender diversity! In female-dominated fields (e.g. Nursing, Education) increasing the proportion of women in the field will lead to reduced gender diversity. This does not seem to be accounted for in the analysis. It would also be helpful to state the number of men and women in each of the 111 fields in the study.

      We have carefully examined the manuscript and revised the text to correctly differentiate between gender diversity and women’s representation.

      We have additionally added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Reviewer #2 (Public Review):

      Summary:

      This important study by LaBerge and co-authors seeks to understand the causal drivers of faculty gender demographics by quantifying the relative importance of faculty hiring and attrition across fields. They leverage historical data to describe past trends and develop models that project future scenarios that test the efficacy of targeted interventions. Overall, I found this study to be a compelling and important analysis of gendered hiring and attrition in US institutions, and one that has wide-reaching policy implications for the academy. The authors have also suggested a number of fruitful future avenues for research that will allow for additional clarity in understanding the gendered, racial, and socioeconomic disparities present in US hiring and attrition, and potential strategies for mitigating or eliminating these disparities.

      We thank the reviewer for their positive assessment of the contributions of our work.

      Strengths:

      In this study, LaBerge et al use data from over 268,000 tenured and tenure-track faculty from over 100 fields at more than 12,000 PhD-granting institutions in the US. The period they examine covers 2011-2020. Their analysis provides a large-scale overview of demographics across fields, a unique strength that allows the authors to find statistically significant effects for gendered attrition and hiring across broad areas (STEM, non-STEM, and topical domains).

      LaBerge et al. find gendered disparities in attrition, using both empirical data and their counterfactual model, that account for the loss of 1378 women faculty across all fields between 2011 and 2020. It is true that "this number is both a small portion of academia... and a staggering number of individual careers," as this loss of women faculty is comparable to losing more than 70 entire departments. I appreciate the authors' discussion about these losses; they note that each of these is likely unnecessary, as women often report feeling that they were pushed out of academic jobs.

      LaBerge et al. also find, by developing a number of model scenarios testing the impacts of hiring, attrition, or both, that hiring has a greater impact on women's representation in the majority of academic fields in spite of higher attrition rates for women faculty relative to men at every career stage. Unlike many other studies of historical trends in gender diversity, which have often been limited to institution-specific analyses, they provide an analysis that spans over 100 fields and includes nearly all US PhD-granting institutions. They are able to project the impacts of strategies focusing on hiring or retention using models that project the impact of altering attrition risk or hiring success for women. With this approach, they show that even relatively modest annual changes in hiring accumulate over time to help improve the diversity of a given field. They also demonstrate that, across the model scenarios they employ, changes to hiring drive the largest improvement in the long-term gender diversity of a field.

      Future work will hopefully, as the authors point out, include intersectional analyses to determine whether a disproportionate share of lost gender diversity is due to the loss of women of color from the professoriate. I appreciate the authors' discussion of the racial demographics of women in the professoriate, and their note that "the majority of women faculty in the US are white" and thus that the patterns observed in this study are predominantly driven by this demographic. I also highly appreciate their final note that "equal representation is not equivalent to equal or fair treatment," and that diversifying hiring without mitigating the underlying cause of inequity will continue to contribute to higher losses of women faculty.

      Weaknesses

      First, and perhaps most importantly, it would be beneficial to include a distinct methods section. While the authors have woven the methods into the results section, I found that I needed to dig to find the answers to my questions about methods. I would also have appreciated additional information within the main text on the source of the data, specifics about its collection, inclusion and exclusion criteria for the present study, and other information on how the final dataset was produced. This - and additional information as the authors and editor see fit - would be helpful to readers hoping to understand some of the nuance behind the collection, curation, and analysis of this important dataset.

      We have expanded upon the description of methods in a new methods section of the paper.

      We have also added a detailed description of the data cleaning steps taken to produce the dataset used in these analyses, including the inclusion/exclusion criteria applied. This detailed description is at the beginning of the methods section. This addition has substantially enhanced the transparency of our data cleaning methods, so we thank the reviewer for this suggestion.

      I would also encourage the authors to include a note about binary gender classifications in the discussion section. In particular, I encourage them to include an explicit acknowledgement that the trends assessed in the present study are focused solely on two binary genders - and do not include an analysis of nonbinary, genderqueer, or other "third gender" individuals. While this is likely because of the limitations of the dataset utilized, the focus of this study on binary genders means that it does not reflect the true diversity of gender identities represented within the professoriate.

      In a similar vein, additional context on how gender was assigned on the basis of names should be added to the methods section.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      I do think that some care might be warranted regarding the statement that "eliminating gendered attrition leads to only modest changes in field-level diversity" (Page 6). While I do not think that this is untrue, I do think that the model scenarios where hiring is "radical" and attrition is unchanged from present (equal representation of women and men among hires (ER) + observed attrition (OA)) show that a sole focus on hiring dampens the gains that can otherwise be addressed via even modest interventions (see, e.g., gender-neutral attrition (GNA) + increasing representation of women among hires (IR)). I am curious as to why the authors did not include an additional scenario where hiring rates are equal and attrition is equalized (i.e., GNA + ER). The importance of including this additional model is highlighted in the discussion, where, on Page 7, the authors write: "In our forecasting analysis, we find that eliminating the gendered attrition gap, in isolation, would not substantially increase representation of women faculty in academia. Rather, progress towards gender parity depends far more heavily on increasing women's representation among new faculty hires, with the greatest change occurring if hiring is close to gender parity." I believe that this statement would be greatly strengthened if the authors can also include a comparison to a scenario where both hiring and attrition are addressed with "radical" interventions.

      Our rationale for omitting the GNA + ER scenario in the presented analysis is that we can reason about the outcomes of this scenario without the need for computation; if a field has equal inputs of women and men faculty (on average) and equal retention rates between women and men (on average), then, no matter the field’s initial age and gender distribution of faculty, the expected value for the percentage of women faculty after all of the prior faculty have retired (which may take 40+ years) is exactly 50%. We have updated the main text to discuss this point.
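
      The reasoning above can be illustrated with a minimal expected-value sketch. It is not the authors' model: the function name, the 40-year career length, the flat 3% attrition risk, and the starting head counts are all assumptions chosen for illustration. It tracks expected head counts by career-age class, applies the same attrition risk to women and men, and fills every opening with hires that are half women.

```python
# Expected-value sketch of the GNA + ER scenario (gender-neutral attrition,
# equal representation among hires). All inputs are illustrative assumptions.

def project_share_of_women(women_by_age, men_by_age, attrition_by_age, years,
                           hire_share_women=0.5):
    """Return the share of women after each simulated year.

    women_by_age / men_by_age: expected head counts per career-age class.
    attrition_by_age: annual attrition risk per class, shared by both genders
    (the entry for the final class is unused because that class retires).
    Every departure is replaced by a new hire at career age 0, so the field
    stays at a fixed size.
    """
    women, men = list(women_by_age), list(men_by_age)
    total = sum(women) + sum(men)
    shares = []
    for _ in range(years):
        # Survivors move up one class; the last class drops out (retirement).
        women = [0.0] + [w * (1 - a) for w, a in zip(women[:-1], attrition_by_age)]
        men = [0.0] + [m * (1 - a) for m, a in zip(men[:-1], attrition_by_age)]
        openings = total - sum(women) - sum(men)
        women[0] = hire_share_women * openings
        men[0] = (1 - hire_share_women) * openings
        shares.append(sum(women) / total)
    return shares


career_length = 40                      # assumed maximum career length
attrition = [0.03] * career_length      # assumed flat 3% annual attrition
# Two very different starting points: 20% women and 80% women.
low_start = project_share_of_women([2.0] * 40, [8.0] * 40, attrition, 40)
high_start = project_share_of_women([8.0] * 40, [2.0] * 40, attrition, 40)
print(round(low_start[0], 3), round(low_start[-1], 3))
print(round(high_start[0], 3), round(high_start[-1], 3))
```

      Under these assumptions, both trajectories end at 50% once every member of the initial roster has aged out, regardless of the starting composition, which is the point made in the response.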

      Reviewer #3 (Public Review):

      This manuscript investigates the roles of faculty hiring and attrition in influencing gender representation in US academia. It uses a comprehensive dataset covering tenured and tenure-track faculty across various fields from 2011 to 2020. The study employs a counterfactual model to assess the impact of hypothetical gender-neutral attrition and projects future gender representation under different policy scenarios. The analysis reveals that hiring has a more significant impact on women's representation than attrition in most fields and highlights the need for sustained changes in hiring practices to achieve gender parity.

      Strengths:

      Overall, the manuscript offers significant contributions to understanding gender diversity in academia through its rigorous data analysis and innovative methodology.

      The methodology is robust, employing extensive data covering a wide range of academic fields and institutions.

      Weaknesses:

      The primary weakness of the study lies in its focus on US academia, which may limit the generalizability of its findings to other cultural and academic contexts.

      We agree that the U.S. focus of this study limits the generalizability of our findings. The findings that we present in this work will only generalize to other populations–whether it be to an alternate industry, e.g., tech workers, or to faculty in different countries–to the extent that these other populations share similar hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript.

      Additionally, the counterfactual model's reliance on specific assumptions about gender-neutral attrition could affect the accuracy of its projections.

      Our projection analysis is intended to illustrate the potential gender representation outcomes of several possible counterfactual scenarios, with each projection being conditioned on transparent and simple assumptions. In this way, the projection analysis is not intended to predict or forecast the future.

      To resolve this point for our readers, we now introduce our projections in the context of the related terms of prediction and forecast, noting that they have distinct meanings as terms of art: On one hand, prediction and forecasting involve anticipating a specific outcome based on available information and analysis, and typically rely on patterns, trends, or historical data to make educated guesses about what will happen. Projections are based on assumptions and are often presented in a panel of possible future scenarios. While predictions and forecasts aim for precision, projections (which we make in our analysis) are more generalized and may involve a range of potential outcomes.

      Additionally, the study assumes that whoever disappeared from the dataset is attrition in academia. While in reality, those attritions could be researchers who moved to another country or another institution that is not included in the AARC (Academic Analytics Research Centre) dataset.

      In our revision, we have elevated this important point, and clarified it in the context of the various ways in which we count hires and attritions. We now explicitly state that “We define faculty hiring and faculty attrition to include all cases in which faculty join or leave a field or domain within our dataset.” Then, we enumerate the number of situations that could be counted as hires and attritions, including the reviewer’s example of faculty who move to another country.

      Reviewer #1 (Recommendations For The Authors):

      Section B: The authors use an age structured Leslie matrix model (see Caswell for a good reference to these) to test the effect of making the attrition rates or hiring rates equal for men and women. My main concern here is the fitting techniques for the parameters. These are described (a little too!) briefly in section S1B. Some specific questions that are left hanging include:

      A 5th order polynomial is an interesting choice. Some statistical evidence as to why it was the best fit would be useful. What other candidate models were compared? What was the "best fit" judgement made with: AIC, r^2? What are the estimates for how good this fit is? How many data points were fitted to? Was it the best fit choice for all of the 111 fields for men and women?

      We use a logistic regression model for each field to infer faculty attrition probabilities across career ages and time, and we include the career age predictor up to its fifth power to capture the career-age correlations observed in Spoon et al., Science Advances, 2023. For ease of reference, we reproduce the attrition risk curves in Fig. S4.

      We note that faculty attrition rates start low and then reach a peak around 5-7 years after earning PhD, and then decline until around 15-20 years post-PhD, after which, attrition rates increase as faculty approach retirement.

      This function shape starts low and ends high, and includes at least one local minimum, which indicates that career age should be odd-ordered in the model and at least order 3. However, including career age only up to its 3rd-order term tended to miss some of the observed career-age/attrition correlations. We evaluated the fit using 5-fold cross-validation with a Brier score loss metric. Among polynomials of degree 1, 3, 5, or 7, we found that 5th order performed well on average over all fields (even if it was not the best for every field), without overfitting in fields with fewer data. Example fits, reminiscent of the figure from Spoon et al., are now provided in Figs S4 and S5.

      While the model fit with fifth order terms may not be the best fit for all 111 fields (e.g., 7th order fits better in some cases), we wanted to avoid field-specific curves that might be overfitted to the field-specific data, especially due to low sample size (and thus larger fluctuations) on the high career age side of the function. Our main text and supplement now include justifications for our choice to include career age up to its fifth order terms.
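
      For readers unfamiliar with this kind of model selection, the sketch below shows 5-fold cross-validation with a Brier score across polynomial degrees 1, 3, 5, and 7. It is a simplified illustration, not the authors' code: the data are synthetic, the fitting routine is plain gradient descent written with the Python standard library, and every function name and parameter value is an assumption.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def poly_features(age, degree):
    x = age / 40.0  # rescale so that powers of career age stay in [0, 1]
    return [x ** k for k in range(1, degree + 1)]


def predict(weights, row):
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], row)))


def fit_logistic(rows, labels, steps=150, learning_rate=1.0):
    """Logistic regression fitted by gradient descent on the mean log loss."""
    weights = [0.0] * (len(rows[0]) + 1)  # intercept first
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            error = predict(weights, row) - y
            gradient[0] += error
            for j, value in enumerate(row):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


def brier_score(probabilities, labels):
    # Mean squared difference between predicted probability and outcome.
    return sum((p - y) ** 2 for p, y in zip(probabilities, labels)) / len(labels)


def cross_validated_brier(ages, labels, degree, folds=5):
    rows = [poly_features(a, degree) for a in ages]
    scores = []
    for fold in range(folds):
        test_idx = list(range(fold, len(rows), folds))
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        weights = fit_logistic([rows[i] for i in train_idx],
                               [labels[i] for i in train_idx])
        probabilities = [predict(weights, rows[i]) for i in test_idx]
        scores.append(brier_score(probabilities, [labels[i] for i in test_idx]))
    return sum(scores) / folds


def synthetic_attrition_risk(age):
    # Assumed non-monotonic risk curve, used only to generate toy data.
    x = age / 10.0
    return sigmoid(-2.0 + 1.08 * x - 1.2 * x ** 2 + x ** 3 / 3.0)


random.seed(0)
ages = [random.uniform(0, 35) for _ in range(200)]
labels = [1 if random.random() < synthetic_attrition_risk(a) else 0 for a in ages]
scores = {degree: cross_validated_brier(ages, labels, degree)
          for degree in (1, 3, 5, 7)}
for degree, score in scores.items():
    print(degree, round(score, 4))
```

      Lower Brier scores indicate better-calibrated held-out predictions. Which degree wins on this toy data says nothing about the real fields; the sketch only shows the mechanics of the comparison.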

      You used the 5th order logistic regression (bottom of page 11) to model attrition at different ages. The data in [24] shows that attrition increases sharply, then drops, then increases again with career age. A fifth order polynomial on its own could plausibly do this, but I associate logistic regression models like this with being monotonically increasing (or decreasing!). Again, more details as to how this worked would be useful.

      Our first submission did not explain this point well, but we hope that Supplementary Figures S4 and S5 provide clarity. In short, we agree of course that typical logistic regression assumes a linear relationship between the predictor variables and the log odds of the outcome variable. This means that the relationship between the predictor variables and the probability of the outcome variable follows a sigmoidal (S-shaped) curve. However, the relationship between the predictor variables and the outcome variable may not be linear.

      To capture more complex relationships, like the increasing, decreasing and then increasing attrition rates as a function of career age, higher-order terms can be added to the logistic regression model. These higher-order terms allow the model to capture nonlinear relationships between the predictor variables and the outcome variable — namely the non-monotonic relationship between rates of attrition and career age — while staying within a logistic regression framework.
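
      A small numerical sketch makes this concrete. The coefficients below are invented for illustration and are not fitted values from the paper; a cubic is used because it is the lowest odd order that can produce the rise, fall, and rise described above, whereas the authors include terms up to the fifth power.

```python
import math


def attrition_probability(career_age, coefficients):
    """Logistic model whose log-odds are a polynomial in career age.

    coefficients[k] multiplies (career_age / 10) ** k, so coefficients[0]
    is the intercept. Career age is rescaled to keep coefficients moderate.
    """
    x = career_age / 10.0
    log_odds = sum(c * x ** k for k, c in enumerate(coefficients))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients: the log-odds have a local maximum near career
# age 6 and a local minimum near career age 18.
example_coefficients = [-4.0, 1.08, -1.2, 1.0 / 3.0]

for age in (0, 6, 18, 35):
    print(age, round(attrition_probability(age, example_coefficients), 3))
```

      Because the logistic function is monotonic in the log-odds, any non-monotonic pattern in the polynomial carries over directly to the predicted probabilities.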

      "The career age of new hires follows the average career age distribution of hires" did you use the empirical distribution here or did you fit a standard statistical distribution e.g. Gamma?

      We used the empirical distribution. This information has been added to the updated methods section in the main text.

      How did you account for institution (presumably available)? Your own work has shown that institution type plays a role which could be contributing to these results.

      See below.

      What other confounding variables could be at play here, what is available as part of the data and what happens if you do/don't account for them?

      A number of variables included in our data have been shown to correlate with faculty attrition, including PhD prestige, current institution prestige, PhD country, and whether or not an individual is a “self-hire,” i.e., trained and hired at the same institution (Wapman et al., Nature, 2022). Additional factors that faculty self-report as reasons for leaving academia include issues of work-life balance, workplace climate, and professional reasons, and in some cases to varying degrees between men and women faculty (Spoon et al., Sci. Adv., 2023).

      Our counterfactual analysis aims to address a specific question: how would women’s representation among faculty be different today if men and women were subjected to the same attrition patterns over the past decade? To answer this question, it is important to account for faculty career age, which we accept as a variable that will always correlate strongly with faculty attrition rates, as long as the tenure filter remains in place and faculty continue to naturally progress towards retirement age. On the other hand, it is less clear why PhD country, self-hire status, or any of the other mentioned variables should necessarily correlate with attrition rates and with gendered differences in attrition rates more specifically. While some or all of these variables may underlie the causal roots of gendered attrition rates, our analysis does not seek to answer causal questions about why faculty leave their jobs (e.g., by testing the impact of accounting for these variables in simulations per the reviewer's suggestion). This is because we do not believe the data used in this analysis is sufficient to answer such questions, lacking comprehensive data on faculty stress (Spoon et al., Sci. Adv., 2023), parenthood status, etc.

      What career age range did the model use?

      The career age range observed in model outcomes is a function of the empirically derived attrition rates for faculty across academic fields. The highest career age observed in the AARC data was 80, and the faculty career ages that result from our model simulations and projections do not exceed 80.

      We have also added the distribution of faculty across career ages for the projection scenario model outputs in the supplemental materials Fig. S3 (see response to your later comment regarding career age for further details). Looking at these distributions, it is observed that very few faculty have career age > 60, both in observation and in our simulations.

      What was the initial condition for the model?

      Empirical 2011 faculty rosters are used as the initial conditions for the counterfactual analysis, and 2020 faculty rosters are used as the initial conditions for the projection analysis. This information has been added to the descriptions of methods in the main text.

      Starting the model in 2011 how well does it fit the available data up to 2020?

      Thank you for this suggestion. We ran this analysis for each field starting in 2011, and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields. This finding is not surprising, because the model is fit to the observed data, but it serves to validate the methods that we used to extract the model's parameters. We have added these results to the supplement (Fig. S2).
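
      The response does not say which criterion was used to judge "statistically indistinguishable." One plausible check, assumed here purely for illustration, is whether the observed value falls inside the central 95% of stochastic simulation outcomes. The function names, field size, hiring probability, and observed values below are all hypothetical.

```python
import random


def simulate_share_of_women(n_faculty, p_woman, rng):
    """One stochastic draw of a field's share of women (toy binomial model)."""
    women = sum(1 for _ in range(n_faculty) if rng.random() < p_woman)
    return women / n_faculty


def central_interval(values, coverage=0.95):
    """Empirical interval containing the central `coverage` of the values."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[int((1 - coverage) / 2 * last)]
    upper = ordered[int(round((1 + coverage) / 2 * last))]
    return lower, upper


def is_indistinguishable(observed, simulated, coverage=0.95):
    low, high = central_interval(simulated, coverage)
    return low <= observed <= high


rng = random.Random(0)
# Hypothetical field: 500 faculty, each a woman with probability 0.30.
simulated = [simulate_share_of_women(500, 0.30, rng) for _ in range(1000)]
print(central_interval(simulated))
print(is_indistinguishable(0.30, simulated))  # observed value near the model mean
print(is_indistinguishable(0.45, simulated))  # observed value far from it
```

      In a real validation, the simulated values would come from running the fitted hiring and attrition model forward from the 2011 rosters, once per field.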

      What are the sensitivity analysis results for the model? If you had made different fitting decisions, how much would the results change? All this applies to both the hiring and attrition parameter estimates.

      We model attrition and hiring using logistic regression, with career age included as an exogenous variable up to its fifth power. A natural question follows: what if we used a model with career age only to its first or third power? Or to higher powers? We performed this sensitivity analysis, and added three new figures to the supplement to present these findings:

      First, we show the observed attrition probabilities at each career age, and four model fits to attrition data (Supplementary Figs S4 and S5). The first model includes career age only to its first power, and this model clearly does not capture the full career age / attrition correlation structure. The second model includes career age to its third power, which does a better job of fitting to the observed patterns. The third model includes career age up to its fifth power, which appears to very modestly improve upon the former model. The fourth model includes career age up to its seventh power, and the patterns captured by this model are largely the same as the 5th-power model up to career age 50, beyond which there are some notable differences in the inferred attrition probabilities. These differences would have relatively little impact on model outcomes because the vast majority of faculty have a career age below 50.

      Second, we show the observed probability that hires are women, conditional on the career age of the hire. Once again, we fit four models to the data, and find that career age should be included at least up to its fifth order in order to capture the correlation structures between career age and the gender of new hires. However, limited differences result from including career age up to the 7th degree in the model (relative to the 5th degree).

      As a final sensitivity analysis, we reproduce Fig. 2, but rather than including career age as an exogenous variable up to its fifth power in our models for hiring and attrition, we include career age up to its third power. Findings under this parameterization are qualitatively very similar to those presented in Fig. 2, indicating that the results are robust to modest changes to model parameterization (shown in supplement Fig. S6).

      Far more detail here, along with some interim results from each stage of the analysis, would make the paper far more convincing. Too much of the analysis currently has a "black box" air, which would easily allow an unconvinced reader to discard the results.

      We have added more detailed descriptions of the methods to the main text. We hope that the changes made will address these concerns.

      Section C: You use the Leslie model to predict the future population. As the model is linear the population will either grow exponentially (most likely) or dwindle to zero. You mention you dealt with this by scaling the average value of H to keep the population at 2020 levels? This would change the ratio of hiring to attrition. How did this affect the timescale of the results? If a field had very minimal attrition (and hence grew massively over the time period of the dataset) the hiring rate would have to be very small too so there would be very little change in the gender balance. Did you consider running the model to steady state instead?

      We chose the 40 year window (2020-2060) for this projection analysis because 40 years is roughly the timespan of a full-length faculty career. In other words, it will take around 40 years for most of the pre-existing faculty from 2020 to retire, such that the new, simulated faculty will have almost entirely replaced all former faculty by 2060.

      For three out of five of our projection scenarios (OA, GNA, OA+ER), the point at which observed faculty are replaced by simulated faculty represents steady state. One way to check this intuition is to observe the asymptotic behavior of the trajectories in Fig. 3B; the slopes for these 3 scenarios nearly level out within 40 years.

      The other two scenarios (OA + IR, GNA+IR) represent situations where women’s representation among new hires is increasing each year. These scenarios will not reach steady state until women represent 100% of faculty. Accordingly, the steady state outcomes for these scenarios would yield uninteresting results; instead, we argue that it is the relative timescales that are interesting.

      What did you do to check that your predictions at least felt realistic under the fitted parameters? (see above for presenting the goodness of fit over the 10 years of the data).

      We ran the analysis suggested in a prior comment (Starting the model in 2011 how well does it fit the available data up to 2020?) and found that model outcomes were statistically indistinguishable from the observed 2020 faculty gender compositions for all 111 academic fields, plus the “All STEM” and “All non-STEM” aggregations.

      You only present the final proportion of women for each scenario. As mentioned earlier, models of this type have a tendency to lead to strange population distributions with wild age predictions and huge (or zero populations). Presenting more results here would assuage any worries the reader had about these problems. What is the predicted age distribution of men and women in the long term scenarios? Would a different method of keeping the total population in check have yielded different results? Interim results, especially from a model as complex as this one, rather than just presenting a final single number answer are a convincing validation that your model is a good one! Again, presenting this result will go a long way to convincing readers that your results are sound and rigorous.

      Thank you for this suggestion. We now include a figure that presents faculty age distributions for each projection scenario at 2060 against the observed faculty age distribution in 2020 (pictured below, and as Fig. S3 in the supplementary materials). We find that the projected age distributions are very similar to the observed distributions for natural sciences (shown) and for the additional academic domains. We hope this additional validation will inspire confidence in our model of faculty hiring and attrition for the reviewer, and for future readers.

      In Fig S3, line widths for the simulated scenarios span the central 95% of simulations.

      Other people have reached almost identical conclusions (albeit with smaller data sets) that hiring is more important than attrition. It would be good to compare your conclusions with their work in the Discussion.

      We have revised the main text to cite the listed examples of similar studies. We thank the reviewer for bringing these relevant works to our attention.

      General comments:

      What thoughts have you given to non-binary individuals?

      Be careful how you use the term "gender diversity"! In many countries "Gender diverse" is a term used in data collection for non-binary individuals, i.e. Male, female, gender diverse. The phrase "hiring more gender diverse faculty" can be read in different ways! If you are only considering men and women then gender balance may be a better framework to use.

      We have added language to the main text which explicitly acknowledges that our analysis focuses on men and women due to limitations in our name-based gender tool, which only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      We have also taken additional care with referring to “gender diversity,” per reviewer 1’s point in their public review.

      Reviewer #2 (Recommendations For The Authors):

      Data availability: I did not see an indication that the dataset used here is publicly available, either in its raw format or as a summary dataset. Perhaps this is due to the sensitive nature of the data, but regardless of the underlying reason, the authors should include a note on data availability in the paper.

      The dataset used for these analyses was obtained under a data use agreement with the Academic Analytics Research Center (AARC). While these data are not publicly available, researchers may apply for data access here: https://aarcresearch.com/access-our-data.

      We also added a table to the supplemental materials (Tab. S3) that reports the estimated number of men and women in each of the 111 fields.

      Additionally, a variety of summary statistics based on this dataset are available online, here: https://github.com/LarremoreLab/us-faculty-hiring-networks/tree/main

      Gender classification: Was an existing package used to classify gender from names in the dataset, or did the authors develop custom code to do so? Either way, this code should be cited. I would also be curious to know what the error rate of these classifications is, and suggest that additional information on potential biases that might result from automated classifications be included in the discussion, under the section describing data limitations. The reliability of name-based gender classification is particularly of interest, as external gender classifications, such as those applied on the basis of an individual's name, may not reflect the gender with which an individual self-identifies. In other words, while for many people their names may reflect their true genders, for others those names may only reflect their gender assigned at birth and not their self-perceived or lived gender identity. Nonbinary faculty are in particular invisibilized here (and through any analysis that assigns binary gender on the basis of name). While these considerations do not detract from the main focus of the study, which was to utilize an existing dataset classified only on the basis of binary gender to assess trends for women faculty, these limitations should be addressed as they provide additional context for the interpretation of the results and suggest avenues for future research.

      We use a free, open-source, and open-data python package called nomquamgender (Van Buskirk et al, 2023) to estimate the strengths of (culturally constructed) name-gender associations. For sufficiently strong associations with a binary gender, we apply those labels to the names in our data. We have updated the main text to make this approach more apparent.

      We have also added language to the main text which explicitly acknowledges that our approach only assigns binary (woman/man) labels to faculty. We point out that this is a compromise due to the technical limitations of name-based gender methodologies and is not intended to reinforce a gender binary.

      As we mentioned in response to the public review, we use a free and open-source python package called nomquamgender to estimate the strengths of name-gender associations, and we apply gender labels to the names with sufficiently strong associations with a binary gender. This package is based on a paper by Van Buskirk et al. 2023, “An open-source cultural consensus approach to name-based gender classification,” which documents error rates and potential biases.

      Page 1: The sentence beginning "A trend towards greater women's representation could be caused..." is missing a conjunction. It should likely read: "A trend towards greater women's representation could be caused entirely by attrition, e.g., if relatively more men than women leave a field, OR entirely by hiring..."

      We have edited the paragraph to remove the sentence in question.

      Pages 1-2: The sentence beginning "Although both types of strategy..." and ending with "may ultimately achieve gender parity" is a bit of a run-on; perhaps it would be best to split this into multiple sentences for ease of reading.

      We have revised this run-on sentence.

      Page 2: See comments in the public review about a methods section, the addition of which may help to improve clarity for the readers. Within the existing descriptions of what I consider to be methods (i.e., the first three paragraphs currently under "results"), some minor corrections could be added here. First, consider citing the source of the dataset in the line where it is first described (in the sentence "For these analyses, we exploit a census-level dataset of employment and education records for tenured and tenure-track faculty in 12,112 PhD-granting departments in the United States from 2011-2020.") It also may be helpful to include context here (or above, in the discussion about institutional analyses) about how "departments" can be interpreted. For example, how many institutions are represented across these departments? More information on how the authors eliminated the gendered aspect of patterns in their counterfactual model would be helpful as well; this is currently hinted at on page 4, but could instead be included in the methods section with a call-out to the relevant supplemental information section (S2B).

      We have added a citation to Academic Analytics Research Center’s (AARC) list of available data elements to the data’s introduction sentence. We hope this will allow readers to familiarize themselves with the data used in our analysis.

      Faculty department membership was determined by AARC based on online faculty rosters. 392 institutions are represented across the 12,112 departments present in our dataset. We have updated the main text to include this information.

      Finally, we have added a methods section to the main text, which includes information on how the gendered aspect of attrition patterns was eliminated in the counterfactual model.

      Page 2: Perhaps some indication of how many transitions from an out-of-sample institution might be helpful to readers hoping to understand "edge cases."

      In our analysis, we consider all transitions from out-of-sample institutions to in-sample institutions as hires, and all transitions away from in-sample institutions, whether to an out-of-sample institution or out of academia entirely, as attritions. We choose to restrict our analysis of hiring and attrition to PhD-granting institutions in the U.S. in this way because our data do not support an analysis of other, out-of-sample institutions.

      I also would have liked additional information on how many faculty switched institutions but remained "in-sample and in the same field" - and the gender breakdowns of these institutional changes, as this might be an interesting future direction for studies of gender parity. (For example, readers may be spurred to ask: if the majority of those who move institutions are women, what are the implications for tenure and promotion for these individuals?)

      While these mid-career moves are not counted as attritions in the present analysis, a study of faculty who switch institutions but remain in-sample as faculty could shed light on gendered faculty retention at the level of institutions. We share the reviewer’s interest in a more in-depth study of mid-career moves and how they affect faculty careers, and we now discuss the potential value of such a study towards the end of the paper. In fact, this subject is the topic of a current investigation by the authors!

      Page 3: I was confused by the statement that "of the three types of stable points, only the first point represents an equitable steady-state, in which men and women faculty have equal average career lengths and are hired in unchanging proportions." Here, for example, computer science appears to be close to the origin on Figure 1, suggesting that hiring has occurred in "unchanging proportions" over the study interval. However, upon analysis of Table S2, it appears that changes in hiring in Computer Science (+2.26 pp) are relatively large over the study interval compared to other fields. Perhaps I am reading too literally into the phrase that "men and women faculty are hired in unchanging proportions" - but I (and likely others) would benefit from additional clarity here.

      We had created an arrow along with the computer science label in Fig. 1, but it was difficult to see, which is likely the source of this confusion. This was our fault, and we have moved the “Comp. Sci.” label and its corresponding arrow to be more visible in Figure 1.

      The change in women’s representation in Computer Science due to hiring over 2011-2020 was +2.26 pp, as the reviewer points out, but, consulting Fig. 1 and the corresponding table in the supplement, we observe that this is a relatively small change compared to most fields.

      Page 3: If possible it may be helpful to cite a study (or multiple) that shows that "changes in women's representation across academic fields have been mostly positive." What does "positive" mean here, particularly when the changes the authors observe are modest? Perhaps by "positive" you mean "perceived as positive"?

      We used the term positive in the mathematical sense, to mean greater than zero. We have reworded the sentence to read “women's representation across academic fields has been mostly increasing…” We hope this change clarifies our meaning to future readers.

      Page 3: The sentence that ends with "even though men are more likely to be at or near retirement age than women faculty due to historical demographic trends" may benefit from a citation (of either Figure S3 or another source).

      We now cite the corresponding figure in this sentence.

      Page 4: The two sentences that begin with "The empirical probability that a person leaves their academic career" would benefit from an added citation.

      We have added a citation to the sentences.

      Figure 3: Which 10 academic domains are represented in Panel 3B? The colors appear to correspond to the legend in Panel 3A, but no indication of which fields are represented is provided. If possible, please label them - it would be interesting and informative to be able to make these comparisons.

      This was not clear in the initial version of Fig. 3B, so we now label each domain. For reference, the domains represented in 3B are (from top to bottom):

      ● Health

      ● Education

      ● Journalism, Media, Communication

      ● Humanities

      ● Social Sciences

      ● Public Administration and Policy

      ● Medicine

      ● Business

      ● Natural Sciences

      ● Mathematics and Computing

      ● Engineering

      Page 6: Consider citing relevant figure(s) earlier up in paragraph 2 of the discussion. For example, the first sentence could refer to Figure 1 (rather than waiting until the bottom of the paragraph to cite it).

      Thank you for this suggestion; we now cite Fig. 1 earlier in this discussion paragraph.

      Page 10: A minor comment on the fraction of women faculty in any given year: the authors assume that the proportion of women in a field can be calculated from knowing the number of women in a field and the number of men. This is, again, true if assuming binary genders but not true if additional gender diversity is included. It is likely that the number of nonbinary faculty is quite low, and as such would not cause a large change in the overall proportions calculated here, but additional context within the first paragraph of S1 might be helpful for readers.

      We have added additional context in the first paragraph of S1, explaining that an additional term could be added to the equation to account for nonbinary faculty representation if our data included nonbinary gender annotations. Thank you for making this point.
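The point about an additional term can be written out explicitly. The symbols below are assumptions introduced for illustration ($N_w$, $N_m$, and $N_{nb}$ for the numbers of women, men, and nonbinary faculty in a field in a given year); the notation used in S1 may differ.

```latex
% Binary-gender case, as the available annotations allow:
f_w = \frac{N_w}{N_w + N_m}

% With nonbinary annotations, the denominator gains one term:
f_w = \frac{N_w}{N_w + N_m + N_{nb}}
```

When $N_{nb}$ is small relative to $N_w + N_m$, the two expressions are nearly equal, which is consistent with the reviewer's expectation that the overall proportions would change little.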

      Page 10: Please include a range of values for the residual terms of the decomposition of hiring and attrition in the sentence that reads "In Figure S1 we show that the residual terms are small, and thus the decomposition is a good approximation of the total change in women's representation."

      These residual terms range from -0.51 pp to 1.14 pp (median = 0.2 pp). We have added this information to the sentence in question.

      Page 12: It may be helpful to readers to include a description of the information contained in Table S2 in the supplemental text under section S3.

      We refer to Table S2 twice in the main text (once in the observational findings, and once in the counterfactual analysis), and the contents of Table S2 are described thoroughly in the table caption.

      Reviewer #3 (Recommendations For The Authors):

      (1) There is a potential limitation in the generalizability of the findings, as the study focuses exclusively on US academia. Including international perspectives could have provided a more global understanding of the issues at hand.

      The U.S. focus of this study limits the generalizability of our findings, as faculty outside the U.S. may exhibit different hiring patterns, retention patterns, and current demographic representation. We have added a discussion of this limitation to the manuscript. Unfortunately, our data do not support international analyses of hiring and attrition.

      (2) I am not sure that everyone who disappeared from the AARC dataset can be counted as "attrition" from academia. Some who disappeared may indeed have left academia entirely. Yet there is also the possibility that some professors left for academic positions in countries outside of the US, or at US institutions that are not included in the AARC dataset. These individuals didn't leave academia. Furthermore, it is also possible that moves to institutions outside the US or not indexed by AARC are gender-specific. Therefore, the study should find a way to test whether the assumption that anyone who disappeared from AARC left academia is indeed valid. If not, how would this potentially challenge the current conclusions?

      The reviewer makes an important point: faculty who move to faculty positions in other countries, to non-PhD-granting institutions, or to institutions that are otherwise not included in the AARC data are all counted as attritions in our analysis. We intentionally define hiring and attrition broadly to include all cases in which faculty join or leave a field or domain within our dataset.

      The types of transitions that faculty make out of the tenure-track system at PhD-granting institutions in the U.S. may correlate with faculty attributes, like gender. For example, women or men may be more likely to transition to tenure-track positions at non-U.S. institutions. Nevertheless, these types of career transition represent an attrition for the system of study, and a hire for another system. Following this same logic, faculty who transition from one field to another field in our analysis are treated as an attrition from the first field and a hire into the new field.

      By focusing on “all-cause” attrition in this way, we are able to draw robust insights for the specific systems we consider (e.g., STEM and non-STEM faculty at U.S. PhD-granting institutions), without being roadblocked by the task of annotating faculty departures and arbitrating which should constitute “valid” attritions.

      (3) It would be very interesting to know how much of the attrition was due to tenure failure. Previous studies have suggested that women are less likely to be granted tenure, which makes me wonder about the role that tenure plays in the gendered patterns of attrition in academia.

      We note that faculty attrition rates start low, reach a peak around 5-7 years after the PhD, and then decline until around 15-20 years post-PhD, after which they increase as faculty approach retirement. The first local maximum appears to coincide roughly with the timing of the tenure clock, but we can only speculate that these attritions are tenure-related. Our dataset is unfortunately not equipped to determine the causal mechanisms driving attrition.

      We reproduce the attrition risk curve in the supplementary materials (Fig. S4).

      (4) The dataset used doesn't fully capture the complexities of academic environments, particularly smaller or less research-intensive institutions (regional universities, historically black colleges and universities, and minority-serving institutions). This point could be added to the manuscript's discussion.

      We have added this point to the description of this study’s limitations in the discussion.

    1. Click on the button below to have the computer execute the main method in the following class. Then, change the code to print your name. Be sure to keep the starting " and ending ". Click on the button to run the modified code.

      Click on the run button below to have the computer execute the main method in the following class. Then, change the code to print your name. Be sure to keep the starting " and ending ". Click on the run button to run the modified code.

    2. Special words—also called keywords—such as public, class, and if must be in lowercase, but class names such as System and String are capitalized.

      If you capitalize a keyword, or forget to capitalize a class name such as System or String, the code will not compile.
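The capitalization rule is easiest to see in a small program. The following is a minimal sketch: the class name `KeywordCaseDemo` and the method `greet` are hypothetical, chosen only for this illustration.

```java
public class KeywordCaseDemo {

    // Keywords such as public, static, if, and return are all lowercase.
    // Class names such as String and System start with a capital letter.
    public static String greet(String name) {
        if (name == null || name.isEmpty()) {
            return "Hello, World!";
        }
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet("Ada"));  // prints "Hello, Ada!"

        // Either of the lines below would stop the program from compiling:
        // Public Static void broken() { }  // keywords must be lowercase
        // string s = "x";                  // the class is String, not string
    }
}
```

Java is case-sensitive, so `Public` and `string` are treated as unknown identifiers rather than as the keyword `public` or the class `String`.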

    1. Video summary [00:00:05] - [00:23:27]: This video presents a lecture by Thomas Rohmer on the transformations of the family in France and their sociological impact. It covers demographic changes since the 1960s, the evolution of family structures, and contemporary challenges linked to parenthood and individualism.

      Key points:
      + [00:00:05] Introduction to the sociology of the family
        * Presentation of the sociological approach
        * Distinction between adolescence and the family
        * Importance of grandparents in adolescents' lives
      + [00:00:47] Transformations of the family since the 1960s
        * Rise in cohabitation and divorce
        * Growth in single-parent and blended families
        * Questions about same-sex marriage and parenthood
      + [00:02:00] Opposition between family values and individual values
        * Debate between conservatives and progressives
        * Importance of not choosing between the individual and the family
        * The family as an institution and an ideological object
      + [00:05:00] The family as an essential institution
        * The family is not a mere collection of individuals
        * Role of symbolic kinship systems
        * Importance of shared values and legal rules
      + [00:09:00] Demographic changes and their consequences
        * Longer life expectancy
        * Age at first marriage and at the birth of the first child
        * Impact on medically assisted reproduction
      + [00:14:00] Challenges to the conjugal family model
        * Evolution since the Napoleonic Code of 1804
        * Gender inequalities and parental authority
        * Implosion of the traditional family model since the 1960s

      Video summary [00:23:29] - [00:47:50]:

      The video discusses the evolution of family relationships and filiation in France, highlighting the growing importance of the child in the family structure and changes in how parenthood and marriage are perceived.

      Key points:
      + [00:23:29] Unconditional love and recognition of the child
        * Childhood is valued as the foundation of the adult personality
        * The specific needs of children and adolescents are increasingly recognized
        * Preparing the child's future becomes central to the family
      + [00:26:58] Family values and the diversity of relationships
        * The family remains a key value despite perceptions of crisis
        * The lifestyles and values of married and cohabiting couples are converging
        * Families face frequent break-ups and recompositions
      + [00:33:55] Time and social inequalities in families
        * The relationship to time is an underlying factor in family difficulties
        * The ability to link the present to the past and the future is crucial
        * Family solidarity can accentuate social inequalities
      + [00:41:29] Redrawing of what is permitted and forbidden in sexual matters
        * Consent becomes the main criterion for permitted sexuality
        * The question of sexual violence is linked to transformations in kinship
        * The legal and moral order around sexuality needs to be rethought

      Video summary [00:47:51] - [00:55:29]:

      The video presents a lecture on meaning and the institution of meaning in language, with an emphasis on the relationship between parents, children, and the law. It also addresses the legal challenges surrounding the presumption of innocence and the credibility of victims in sexual violence cases.

      Key points:
      + [00:47:51] Meaning in language
        * Children see their parents as the masters of meaning
        * Importance of distinguishing between what the parent says and the law
      + [00:49:14] Legal challenges and consent
        * Discussion of creating a new normative order based on consent
        * Concerns about the presumption of innocence and the burden of proof
      + [00:52:01] Presumption of truthfulness
        * Need to grant a credit of truthfulness to victims' statements
        * Search for solutions that do not condemn victims to silence
      + [00:54:39] Redrawing of the permitted and the forbidden
        * Discussion of how the permitted and the forbidden are being redrawn in society
        * The importance of maintaining democratic safeguards while seeking solutions

    1. DealsThe O.G PizzaCitro Signature PizzaPrawn PizzaChicken PizzaVegetarian PizzaVegan PizzaAntipastiPastaBreadsSaladsWingsDessertSoft Drinks var last_io_selected = new Array(); $(document).ready(function() { var load_once; if (typeof code_happened === 'undefined') { window.code_happened = true; load_once = true; }else{ load_once = false; } //WEB-589 Allow upto 99 items in 1 selection var qty_selections = ''; for(var i = 1; i <= 99; i++) { qty_selections += '<option>'+i+'</option>'; } $("#item-buttons .qty-select.qty").html(qty_selections); var current_width = $(window).width(); var current_height = $(window).height(); if(current_width < 481){ var current_height1 = current_height - 215; $("#menu-items .modal-popup .modal-body").css('max-height', current_height1 +'px'); $("#menu-items .modal-popup .modal-body").css('min-height', current_height1 +'px'); } if(current_width < 321){ var current_height2 = current_height - 225; $("#menu-items .modal-popup .modal-body").css('max-height', current_height2 +'px'); $("#menu-items .modal-popup .modal-body").css('min-height', current_height2 +'px'); } if(load_once){ $(".qty-btn-popup-minus").live("click", function(){ var parent_div = $(this).closest('li').attr('id'); parent_div = (typeof parent_div !== "undefined" && parent_div !== false) ? "#"+parent_div+" " : ""; var PLU = $(this).attr('ref'); var group_id=$(this).attr('ref-group-id'); var counter = $(parent_div+'#qty-'+group_id).text(); counter--; if (counter <= 0){ counter = 1; } if (group_id <= 0){ $(parent_div+'#qty-'+PLU).text(counter); }else{ $(parent_div+'#qty-'+group_id).text(counter); } }); $(".qty-btn-popup-plus").live("click", function(){ var parent_div = $(this).closest('li').attr('id'); parent_div = (typeof parent_div !== "undefined" && parent_div !== false) ? 
"#"+parent_div+" " : ""; var PLU = $(this).attr('ref'); var group_id=$(this).attr('ref-group-id'); var counter = $(parent_div+'#qty-'+group_id).text(); counter++; if (counter >= 99){ counter = 99; } if (group_id <= 0){ $(parent_div+'#qty-'+PLU).text(counter); }else{ $(parent_div+'#qty-'+group_id).text(counter); } }); } var option_id; function priceBaseOnOrderType(parent_div) { order_type = $("#order-type-bt .active").val(); // if order_type is empty or undefined // order_type is undefined when store is offline if(order_type == null || order_type == 'undefined') order_type = $("#current_order_type_holder").val(); $("#"+ parent_div +" .extra-toppings-checkbox").each(function() { var price = $(this).attr('value'); var plu = $(this).attr('plu'); if(price == 0){ price = order_type == 'pickup' ? $(this).data('sell-shop') : (order_type == 'delivery' ? $(this).data('sell-delivery') : $(this).data('sell-table')); $(this).attr('value', price); $("#"+ parent_div +" #condiment-price-"+plu).html(price); } if(price == null || price == 'undefined' || !price){ price = 0; $("#"+ parent_div +" #condiment-price-"+plu).html(price); } }); return false; } function get_condiments(plu, parent_div, currentToppings, extraToppings, defaultToppings){ $("#"+parent_div+" .popup-condiments").show(); $("#"+parent_div+" .popup-toppings").css("opacity", "0.3"); $("#"+parent_div+" .lds-ring").show(); $("#"+parent_div+" .modal-footer .footer_overlay").show(); $.ajax({ type: "POST", url: "core/ajax/get_popup_toppings.php", data: {"cid": "10914", "plu": plu, "currenttoppings": currentToppings, "extratoppings": extraToppings, "defaulttoppings": defaultToppings}, success: function(data) { if(data){ $(".popup-toppings").html(''); $("#"+parent_div+" .lds-ring").hide(); $("#"+parent_div+" .popup-toppings").html(data); //WEB-395 UPSELL var upsell_id = parent_div.substring(parent_div.lastIndexOf("_")+1); var upsell_container = $("#menu-"+upsell_id+"-upsell-items"); if(upsell_container.length > 0) { 
$("#"+parent_div+" .popup-toppings").append(upsell_container.html()); $("#"+parent_div+" .upsell-item-chkbox").die("change").live("change", function(){ var popup_total = parseFloat($("#"+parent_div+" .popup-item-price").text().substring(1)); var upsell_item_price = parseFloat($(this).data("price")); if($(this).is(":checked")) { popup_total += upsell_item_price; } else { popup_total -= upsell_item_price; } $("#"+parent_div+" .popup-item-price").text("$"+(popup_total).toFixed(2)); }); } $("#"+parent_div+" .popup-toppings").css("opacity", "1"); } }, complete: function(data) { priceBaseOnOrderType(parent_div); $("#"+parent_div+" .modal-footer .footer_overlay").hide(); // Reset Styles for WEB-573 Line separation on the item modal $('#'+ parent_div + ' .item-option-radio-menu').css("border-bottom", "none"); $('#'+ parent_div + ' .popup-current-toppings').css("border-top", "none"); $('#'+ parent_div + ' #extra-toppings').css("border-top", "none"); var line_chk_01 = $('#'+ parent_div +' .menu-item-option-popup').children().length > 0; var line_chk_02 = $('#'+ parent_div +' .item-option-radio-menu').children().length > 0; var line_chk_03 = $('#'+ parent_div + ' .popup-current-toppings').children().length > 0; var line_chk_04 = $('#'+ parent_div + ' #extra-toppings').children().length > 0; if(line_chk_01 == true && $('#'+ parent_div + ' .item-option-radio-menu').length > 0) { $('#'+ parent_div + ' .item-option-radio-menu')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02) && $('#'+ parent_div + ' .popup-current-toppings').length > 0) { $('#'+ parent_div + ' .popup-current-toppings')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02 || line_chk_03) && $('#'+ parent_div + ' #extra-toppings').length > 0) { $('#'+ parent_div + ' #extra-toppings')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02 || line_chk_03 || 
line_chk_04) && $('#'+ parent_div + ' .upsell-header').length > 0) { $('#'+ parent_div + ' .upsell-header')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } } }); } function item_option_list(data, groupId, menuId, io, isMultiple, multipleItemGrpId, itemCtr){ var counter = 0, // for padding of the right and left side of the col-sm-6 input_type = "radio", padding='', checked='', active='', item_option_html='', option_name = data[0].option_name, option_display_name = data[0].option_display_name, min_option = (data[0].min_permitted !== undefined) ? data[0].min_permitted : 0, max_option = (data[0].max_permitted !== undefined) ? data[0].max_permitted : 1; if(!isMultiple){ item_option_html += '<p style="color:black; font-size:13.5px; width:100%">'+(option_display_name ? option_display_name : option_name)+'</p>'; } else{ var io_required=''; if(min_option <= 0){ io_required = "Choose up to "+max_option; } else if(min_option == max_option){ io_required = "Required"; io_required += (min_option > 1) ? " - Choose "+min_option : ""; } else{ io_required = "Required - Choose between "+min_option+" and "+max_option; } item_option_html += '<div'+(itemCtr > 1 ? ' style="margin-top:15px;"' : '')+' class="multi-option-select">' +'<div class="multi-option-name item-option-group-name-'+multipleItemGrpId+'" style="position:relative; float:left; width:100%;">' +'<p style="color:black; font-size:15px; font-weight:600; padding-bottom:0;">'+(option_display_name ? 
option_display_name : option_name)+'</p>' +'<span style="font-size:15px; color:#a1a1a1;">'+io_required+'</span>' +'</div>'; } $.each(data, function(key, value){ checked=''; active=''; counter++; if(counter == 1){ padding = 'padding-right:15px; padding-left:0px;'; }else{ padding = 'padding-right:0px; padding-left:15px;'; counter = 0; } if(!isMultiple){ if(value.default_item_option_id == value.id){ checked = 'checked'; active = 'https://deliverit-online-resources-prd.s3.ap-southeast-2.amazonaws.com/templates/template4/img/icon-check.png'; } input_type = "radio"; } else{ input_type = "checkbox"; } var price_txt = (value.item_price > 0) ? ' - $' + value.item_price : ''; item_option_html += '<div class="input-group-radio item-option-input-group col-sm-6" style="'+padding+'">' +'<input type="'+input_type+'" style="display:none;" ref="'+value.id+'" name="item-option-radio-'+groupId+(isMultiple ? "-"+multipleItemGrpId : "")+'" class="item-option-radio-list" value="'+value.item_price+'" id="item-option-'+groupId+'-'+menuId+'-'+value.id+'"'+((!isMultiple && value.default_item_option_id != 0) ? " default-io='"+value.default_item_option_id+"'" : "")+(isMultiple ? 
" multiple-io='true' mio-id='"+multipleItemGrpId+"' min-io='"+value.min_permitted+"' max-io='"+value.max_permitted+"'" : "")+' '+checked+'>' +'<label for ="item-option-'+groupId+'-'+menuId+'-'+value.id+'" style="font-weight:normal !important; padding:5px 10px; border-radius:5px; user-select:none; -moz-user-select:none; -webkit-user-select:none; -ms-user-select:none; display:flex; justify-content: space-between" class="input-group-label input-group-label-template2 input-group-label-default">'+value.item_name+price_txt+'<img src="'+active+'" class="check-img-popup" style="float:right; align-self:center"></label>' +'</div>'; }); if(isMultiple){ item_option_html += '</div>'; } $(item_option_html).appendTo(io); } if(load_once){ $(".add-button-popup").live("click", function(){ var parent_div = $(this).closest('li').attr('id'); var parent_div_class = $(this).closest('li').attr('class'); var plu = $(this).attr('ref'); $("#"+ parent_div +" #toppings_left").hide(); $(".popup-orig-price").html('0'); $(".popup-item-price").html(''); var t = 0; $("#"+ parent_div +" .input-group-label").each(function() { if($(this).hasClass('active')){ t = 1; } }); if(t == 0){ $("#"+ parent_div +" .input-group-label").each(function() { $(this).addClass('active'); return false; }); } if($("#"+ parent_div +" .item-option-radio-list[multiple-io]").length > 0){ $("#"+ parent_div +" .item-option-radio-list[multiple-io]").each(function(){ if($(this).siblings(".input-group-label").hasClass("active")){ $(this).siblings(".input-group-label").removeClass("active"); } }); } var popup_price = $("#"+ parent_div +" .active #popup-price").html(); var hide_toppings = $("#"+ parent_div +" .active").attr('hide_toppings'); var active_plu = $("#"+ parent_div +" .active").parent().find('.radio-button-popup').attr('plu'); $("#"+ parent_div + " .modal-footer .qty").html('1'); $("#"+ parent_div + " .popup-item-price").html(popup_price); if(hide_toppings == 0){ get_condiments(active_plu, parent_div, "", "", ""); } 
get_item_option(parent_div); }); } $(".qty-btn-popup").live("click", function(){ var parent_div = $(this).closest('li').attr('id'); var popup_price = $("#"+ parent_div +" .active #popup-price").html(); if(popup_price){ popup_price = parseFloat(popup_price.replace('$', '')); var popup_qty = $(this).parent().find('.qty').html(); var toppings_price = $("#"+ parent_div +" .popup-orig-price").html(); toppings_price = parseFloat(toppings_price.replace('$', '')); var total_price = parseFloat(popup_price) * parseInt(popup_qty); var toppings_toppings_price = parseFloat(toppings_price) * parseInt(popup_qty); var upsell_total = 0; $("#"+parent_div+" .upsell-item-chkbox:checked").each(function(){ var upsell_price = parseFloat($(this).data("price")); upsell_total += upsell_price; }); total_price = total_price + toppings_toppings_price + upsell_total; $(this).parent().parent().parent().find('.popup-item-price').html("$"+total_price.toFixed(2)); get_item_option(parent_div); } }); $(".item-option-radio-list").live("click", function(){ var parent_div = $(this).closest('li').attr('id'); var multiple_io = $(this).attr('multiple-io'); var mio_id = $(this).attr('mio-id'); if(typeof multiple_io === "undefined" || multiple_io === false){ $('#'+ parent_div +' .item-option-radio-list').attr('checked', false); $(this).attr('checked', true); } else{ var min_io = $(this).attr('min-io'); var max_io = $(this).attr('max-io'); if($("#"+parent_div+" [name='"+$(this).attr('name')+"']:checked").length >= min_io){ $(this).parent().parent().css({"padding":"", "border":""}); } if(max_io == 1 && $("#"+parent_div+" [name='"+$(this).attr('name')+"']:checked").length > 1){ $("#"+parent_div+" [name='"+$(this).attr('name')+"']").attr('checked', false); $(this).attr('checked', true); } else if($("#"+parent_div+" [name='"+$(this).attr('name')+"']:checked").length > max_io){ $("#"+last_io_selected[mio_id]).attr('checked', false); } last_io_selected[mio_id] = $(this).attr('id'); } checked_io($(this), 
"template4"); get_item_option(parent_div); }); $(".menu-option-radio-list").live("click", function(){ // code for the new settings called customise_popup var group_id = $(this).attr('ref'); var menu_id = $(this).attr('menu-id'); var plu = $(this).attr('plu'); var old_plu = $(this).siblings('.input-group-label').hasClass('active') var hide_toppings = $(this).siblings('.input-group-label').attr('hide_toppings'); var parent_div = $(this).closest('li').attr('id'); var qty = $("#"+ parent_div + " .qty").html(); var io = $("#"+ parent_div + " .item-option-radio"); $("#"+ parent_div +" #toppings_left").hide(); var default_toppings = $("#"+parent_div+" .current-toppings-checkbox").map(function(){ return $(this).attr("plu"); }).get(); var current_toppings = $("#"+parent_div+" .current-toppings-checkbox:checked").map(function(){ return $(this).attr("plu"); }).get(); var extra_toppings = $("#"+parent_div+" .extra-toppings-checkbox:checked").map(function(){ return $(this).attr("plu"); }).get(); if(old_plu == false){ if(plu){ if(hide_toppings == 0){ get_condiments(plu, parent_div, current_toppings, extra_toppings, default_toppings); }else{ $("#"+ parent_div + " .popup-condiments").hide(); $("#"+ parent_div + " .popup-toppings").empty(); } } $(".popup-item-price").html(''); $("#"+ parent_div + " .popup-orig-price").html('0'); var popup_price = $(this).siblings('.input-group-label').children('#popup-price').html(); popup_price = parseFloat(popup_price.replace('$', '')); var upsell_total = 0; $("#"+parent_div+" .upsell-item-chkbox:checked").each(function(){ var upsell_price = parseFloat($(this).data("price")); upsell_total += upsell_price; }); var total_price = (parseFloat(popup_price) * parseInt(qty)) + upsell_total; $("#"+ parent_div + " .popup-item-price").html('$' + total_price.toFixed(2)); } // for the icon checked besides the label of radio button $(this).parent().parent().find('.check-img-popup').attr('src',""); 
$(this).parent().parent().find('.input-group-label').removeClass('active'); $(this).siblings('.input-group-label').children('.check-img-popup').attr("src","https://deliverit-online-resources-prd.s3.ap-southeast-2.amazonaws.com/templates/template4/img/icon-check.png"); $(this).siblings('.input-group-label').addClass('active'); //fix for safari img shown as broken image $(this).parent().parent().find('.check-img-popup').css('visibility',"hidden"); $(this).siblings('.input-group-label').children('.check-img-popup').css('visibility',"visible"); $("#"+parent_div+' input[name="menu-item-option-radio-'+group_id+'"]').attr('checked', false); $(this).attr('checked', true); $("#" + group_id).attr('ref', $(this).val()); if(old_plu == false){ $('#'+ parent_div + '.item-option-radio-menu').hide(); $("#"+parent_div+" .radio-button-popup").attr('disabled','disabled'); // to prevent multiple item option when radio button is spammed if (io) { io.empty(); $.ajax({ url: 'core/ajax/item_options.php', type: "POST", data: { "plu": plu }, dataType: 'json', success: function (data) { // For item-options that was hidden because of no item option on default size // We need to show it else hide if no data was returned if(Object.keys(data).length >= 1 && data){ $('#'+ parent_div + ' .item-option-radio-menu').show(); $(io).fadeIn(0); if(data.hasOwnProperty('multiple_io')){ delete data['multiple_io']; var io_ctr = 1; $.each(data, function(key, value){ item_option_list(value, group_id, menu_id, io, true, key.trim(), io_ctr); io_ctr++; }); } else{ item_option_list(data, group_id, menu_id, io, false, "", 0); } }else{ $('#'+ parent_div + ' .item-option-radio-menu').hide(); } if (!$("#"+parent_div+" input[name=item-option-radio-"+group_id+"]:checked").val()) { // if no item option is checked, we make the first item the default //fix for safari img shown as broken image $("#"+parent_div+" input:radio[name=item-option-radio-"+group_id+"]:not(:disabled):first").attr('checked', true); $("#"+parent_div+" 
input:radio[name=item-option-radio-"+group_id+"]:not(:disabled):first").siblings(".input-group-label").children('.check-img-popup').attr("src","https://deliverit-online-resources-prd.s3.ap-southeast-2.amazonaws.com/templates/template4/img/icon-check.png").css('visibility',"visible"); $("#add-popup-"+group_id+"-"+menu_id).find("input:not(:checked)").siblings(".input-group-label").find(".check-img-popup").css('visibility',"hidden"); } $("#"+parent_div+" .radio-button-popup").attr('disabled', false); }, complete: function (data) { get_item_option(parent_div); // Reset Styles for WEB-573 Line separation on the item modal $('#'+ parent_div + ' .item-option-radio-menu').css("border-bottom", "none"); $('#'+ parent_div + ' .popup-current-toppings').css("border-top", "none"); $('#'+ parent_div + ' #extra-toppings').css("border-top", "none"); var line_chk_01 = $('#'+ parent_div +' .menu-item-option-popup').children().length > 0; var line_chk_02 = $('#'+ parent_div +' .item-option-radio-menu').children().length > 0; var line_chk_03 = $('#'+ parent_div + ' .popup-current-toppings').children().length > 0; var line_chk_04 = $('#'+ parent_div + ' #extra-toppings').children().length > 0; if(line_chk_01 == true && $('#'+ parent_div + ' .item-option-radio-menu').length > 0) { $('#'+ parent_div + ' .item-option-radio-menu')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02) && $('#'+ parent_div + ' .popup-current-toppings').length > 0) { $('#'+ parent_div + ' .popup-current-toppings')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02 || line_chk_03) && $('#'+ parent_div + ' #extra-toppings').length > 0) { $('#'+ parent_div + ' #extra-toppings')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } if((line_chk_01 || line_chk_02 || line_chk_03 || line_chk_04) && $('#'+ parent_div + ' .upsell-header').length > 0) { $('#'+ parent_div + ' 
.upsell-header')[0].style.setProperty("border-top", "1px solid #00000038", "important"); } } }); } } }); function isNumeric(n) { return !isNaN(parseFloat(n)) && isFinite(n); } var items = {}; var free_toppings_list = []; function calculateItems() { var total = 0; for (var plu in items) { total += items[plu]; } return total; } if(load_once){ $(".extra-toppings-checkbox").live('change', function () { var parent_div = $(this).closest('.modal-popup').closest('li').attr('id'); var popup_price = $("#"+ parent_div +" .popup-item-price").html(); var toppings_price = $("#"+ parent_div +" .popup-orig-price").html(); var qty = $("#"+ parent_div + " .qty").html(); var counter_free_extras = parseInt( $("#"+ parent_div +" #num-free-toppings").html()); var max_toppings; var num_free_extra = 0; var price = 0; if($("#"+ parent_div +" #max_toppings").length){ max_toppings = $("#"+ parent_div +" #max_toppings p").html(); }else{ max_toppings = 12; } if($("#"+ parent_div +" #num-free-toppings").length){ num_free_extra = $("#"+ parent_div +" #num-free-toppings").html(); num_free_orig = $("#"+ parent_div +" #num-free-orig").html(); } var plu = $(this).attr('plu'); var cur_toppings = $("#"+parent_div+" .extra-toppings-checkbox:checked").length; //will happen if there is no set limit var remaining_ = max_toppings - cur_toppings; // the text is valid since it always being updated by priceBase function // PREVENT ADDING MORE ITEMS if (remaining_ < 0) { $("#"+ parent_div +" #toppings_left").show().delay(1000).fadeOut(); $("#"+ parent_div +" #toppings_left").html("You have reached the extras limit of "+max_toppings); $(this).prop('checked', false); return false; }else{ $("#"+ parent_div +" #toppings_left").hide(); $("#"+ parent_div +" #toppings_left").html(''); } // Update the price fetching, now respects the order type // Please NOTE that overridden condiment prices will reflect on both pickup/delivery var order_type = 'pickup'; price = $(this).attr('value'); if(!price || price <= 0){ price = 
order_type == 'pickup' ? $(this).data('sell-shop') : (order_type == 'delivery' ? $(this).data('sell-delivery') : $(this).data('sell-table')); } if($("#"+ parent_div +" #num-free-toppings").length){ if (num_free_extra > 0 && this.checked) { price = 0; $("#"+parent_div+" #num-free-toppings").html(parseFloat(num_free_extra) - 1); $(this).addClass('free_item'); }else if(num_free_extra == 0 && this.checked){ $("#"+parent_div+" #num-free-toppings").html('0'); $(this).removeClass('free_item'); }else if(cur_toppings < num_free_orig){ price = 0; $("#"+parent_div+" #num-free-toppings").html(parseFloat(num_free_extra) + 1); $(this).removeClass('free_item'); } else if(!this.checked) { //Check if the checkbox is uncheck counter_free_extras += 1; } } if(price == null || price == 'undefined' || !price){ price = 0; } var popup_orig = parseFloat($("#"+ parent_div +" .active #popup-price").html().replace('$', '')); if(this.checked){ price = price; }else{ if(counter_free_extras > 0 && num_free_extra <= 0) { $("#"+parent_div+" .toppings-checkbox::checked").addClass("free_item"); } price = '-'+price; } popup_price = parseFloat(popup_price.replace('$', '')); var item_total = 0; var toppings_total = 0; toppings_total = parseFloat(price) + parseFloat(toppings_price); items[plu] = parseFloat(price); $("#"+ parent_div + " .popup-orig-price").html(toppings_total.toFixed(2)); price = (qty) ? 
(price * qty) : price; item_total = parseFloat(price) + popup_price; //Check if the free extras exceed and it will start add the price of toppings var upsell_total = 0; $("#"+parent_div+" .upsell-item-chkbox:checked").each(function(){ var upsell_price = parseFloat($(this).data("price")); upsell_total += upsell_price; }); if(counter_free_extras > 0 && num_free_extra <= 0) { $("#"+ parent_div + " .popup-item-price").html('$' + (popup_orig+upsell_total).toFixed(2)); } else { $("#"+ parent_div + " .popup-item-price").html('$' + (item_total+upsell_total).toFixed(2)); } if($("#"+ parent_div +" #max_toppings").length){ var counter; if(this.checked){ counter = 1; }else{ counter = '-'+1; } max_toppings = $("#"+ parent_div +" #max_toppings span").html(); var toppings_left = max_toppings - counter; $("#"+parent_div+" #max_toppings span").html(toppings_left); } }); } $(".customise-add-button").click(function () { if($("input[name='storestatus']").val()=='offline'){ $.prompt($('#offline-alert-txt').html()); return; } var menu_id = $(this).closest('#menu_items').attr('data-menuid'); var parent_div = $(this).closest('.modal-popup').closest('li').attr('id'); var modal_div = $(this).closest('.modal-popup').attr('id'); var price = $("#"+ parent_div +" .popup-item-price").html(); price = parseFloat(price.replace('$', '')); var PLU = $(this).attr('ref'); var qty = $("#"+ parent_div +" .qty").html(); option_id=''; var mio_ids = [], mio_msg = [], mio_req = 0; if($("#"+ parent_div +" .item-option-radio-list").length > 0){ $("#"+ parent_div +" .item-option-radio-list").each(function(){ if($(this).is(':checked')){ option_id += (option_id.trim()!="") ? "," : ""; option_id += $(this).attr('ref'); } if(typeof $(this).attr('mio-id') !== "undefined" && $(this).attr('mio_id') !== false){ mio_ids.push($(this).attr('mio-id')); } mio_ids = [... 
new Set(mio_ids)]; }); if(typeof $("#"+ parent_div +" .item-option-radio-list:first").attr('multiple-io') !== "undefined" && $("#"+ parent_div +" .item-option-radio-list:first").attr('multiple-io') !== false){ var io_div; $.each(mio_ids, function(key, val){ var io_selected = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:checked").length, io_min = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").attr('min-io'), io_max = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").attr('max-io'), io_name = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").parent().siblings("div.item-option-group-name-"+val).find("p").text(), io_container = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").parent().parent(); io_container.css({"padding":"", "border":""}); if(io_selected < io_min){ io_container.css({"padding":"5px", "border":"1px solid red"}); mio_req++; if(!io_div){ io_div = io_container; } } else if(io_selected > io_max){ mio_msg.push("'"+io_name+"' can't have more than "+io_max); } }); if(mio_req > 0){ $.prompt("Please complete all sections to finish your order"); $("#"+modal_div+" .modal-body").scrollTop(0).scrollTop(io_div.position().top - 25); return false; } if(mio_msg.length > 0){ $.prompt(mio_msg.join("<br/>")); return false; } $("#"+parent_div+" .item-option-radio-list").attr('checked', false); checked_io($("#"+parent_div+" .item-option-radio-list"), "template4"); } else{ var default_io; default_io = $("#"+ parent_div +" .item-option-radio-list:checked").attr('default-io'); if(typeof default_io !== "undefined" && default_io !== false){ $("#"+parent_div+" .item-option-radio-list[ref='"+default_io+"']").attr('checked', true); } else{ $("#"+parent_div+" .item-option-radio-list:not(:disabled):first").attr('checked', true); } checked_io($("#"+parent_div+" .item-option-radio-list:checked"), "template4"); } } var topping_array = new Array(3); var i = 0; var f = 
free_toppings_list.slice(); // to prevent mutation of the original list f.sort(); $("#"+ parent_div + " .toppings-checkbox").each(function () { if ($(this).hasClass('current_item') || ($(this).hasClass('extra-toppings-checkbox') && this.checked)){ var extra_plu = $(this).attr('plu'); var extra_price = parseFloat(this.value); var unit_price = parseFloat(this.value); var extra_qty = (this.checked == true ? 1 : -1); var is_current = $(this).hasClass('current_item'); var has_chargeable = false; if (!extra_price) { extra_price = 0; } if (!is_current || extra_qty < 0 || extra_qty > 1) { if (is_current && extra_qty > 1) { extra_qty--; } if (is_current && extra_qty < 0) { extra_price = 0; }else if($(this).hasClass('free_item') || is_current){ extra_price = 0; }else{ extra_price = unit_price; } topping_array[i] = new Array(3); topping_array[i][0] = extra_plu; topping_array[i][1] = (extra_qty * qty); topping_array[i][2] = extra_price; i++; // we add it after if (has_chargeable == true) { //increase the counter to prevent overwriting this index topping_array[i] = new Array(3); topping_array[i][0] = extra_plu; topping_array[i][1] = 1; topping_array[i][2] = 0; i++; } } } }); var payload = {"PLU": PLU, "qty": qty, "price": price, "option_id": option_id, "topping_array[]": topping_array, "menu_id" : menu_id}; $.ajax({ url: 'core/mybasket.php', type: "POST", data: payload, success: function (data) { $('#view-basket').html(data); get_cart_total(); if (!$('#free_item_plu').length) { new PNotify({ text: 'Item added to order.', width: "220px", delay: 3000, type: 'success' }); } $("#"+ parent_div + " .popup-orig-price").html('0'); option_id = ''; last_io_selected = []; } }).done(function(){ var upsell_item = {}; var c = 0; $("#"+parent_div+" .upsell-item-chkbox:checked").each(function(i){ var upsell_plu = $(this).attr("data-plu"); var upsell_price = $(this).data("price"); upsell_item[c.toString()] = { "PLU": upsell_plu, "price": upsell_price, "qty": 1, "menu_id": menu_id, "is_upsell": 
true}; c++; }); if(Object.keys(upsell_item).length > 0) { $.ajax({ url: 'core/mybasket.php', type: "POST", data: $.param(upsell_item), success: function (data) { $('#view-basket').html(data); get_cart_total(); if (!$('#free_item_plu').length) { new PNotify({ text: 'Item added to order.', width: "220px", delay: 3000, type: 'success' }); } } }); } }); }); function get_cart_total(){ $('#cartTotal').load("core/ajax/get_cart_total.php?page="+PAGE_NAME, function(data){ $('#cartTotal, .cartTotal').html('$'+data); }); } $("#promo_button").click(function(){ if($("input[name='storestatus']").val()=='offline'){ $.prompt($('#offline-alert-txt').html()); return; } $('#loading_bar').html("<img src='https://d2ova09jg8x3xk.cloudfront.net/citropizza.com.au/images/ajax-loader.gif'>"); $('#loading_bar').center(); var PLU = $(this).find("#add-prompt").attr('ref'); var qty = $("#promo_button #"+PLU+"-qty").val(); var price = $("#promo_button #"+PLU+"-price").val(); // greater than 1 because do not include the 1st item, which is :: Please select :: if($("#promotional_content #"+PLU+"-item-option option").length > 1) { var option_id = $("#promotional_content #"+PLU+"-item-option").val(); } var menu_id = $(this).parents("#menu_items").data('menuid'); $.ajax({ url: 'core/mybasket.php', type: "POST", data: { "PLU":PLU, "qty":qty, "price":price, "option_id":option_id, "menu_id":menu_id }, success: function(data){ $('#loading_bar').html(''); get_cart_total(); $('#view-basket').html(data); last_io_selected = []; if (!$('#free_item_plu').length) { new PNotify({ text: 'Item added to order.', width: "220px", delay: 3000, type: 'success' }); } $("#promotional_container").removeClass('active'); } }); }); $(".add-button").click(function(){ var parent_div = $(this).closest('li').attr('id'); var modal_div = $(this).closest('.modal-popup').attr('id'); if($("input[name='storestatus']").val()=='offline'){ $.prompt($('#offline-alert-txt').html()); return; } $('#loading_bar').html("<img 
src='https://d2ova09jg8x3xk.cloudfront.net/citropizza.com.au/images/ajax-loader.gif'>"); $('#loading_bar').center(); var PLU = $(this).attr('ref'); if(!isNaN(PLU)){ var qty = $("#"+parent_div).find("#"+PLU+"-qty").val(); var price = $("#"+parent_div).find(+"#"+PLU+"-price").val(); }else{ var qty = $("#"+parent_div+" #"+PLU+"-qty").val(); var price = $("#"+parent_div+" #"+PLU+"-price").val(); } var group_id = $(this).attr('id'); //check if item is from promotional prompt if(qty == null && price == null && $(this).attr('id')=="promo_button"){ var PLU = $(this).find("#add-prompt").attr('ref'); var qty = $("#promo_button #"+PLU+"-qty").val(); var price = $("#promo_button #"+PLU+"-price").val(); // greater than 1 because do not include the 1st item, which is :: Please select :: if($("#promotional_content #"+PLU+"-item-option option").length > 1) { var option_id = $("#promotional_content #"+PLU+"-item-option").val(); } } if($("#"+parent_div+" #menu-"+group_id).length > 0){ price = $("#"+parent_div+" #menu-"+group_id+" option:selected").attr('ref'); qty = $("#"+parent_div+" #qty-"+group_id).val(); if($("#"+parent_div+" .qty-label-popup").length > 0){ // if popup is enabled then we override the qty qty = $("#"+parent_div+" #qty-"+group_id).text(); var price = $("#"+parent_div+' input[name="menu-item-option-radio-'+group_id+'"]:checked').attr('price'); if(qty <= 0){ // if item option only then we override the qty qty = $("#"+parent_div+" #qty-"+PLU).text(); } } } var default_io, mio_ids = [], mio_msg = [], mio_req = 0; if($("#"+ parent_div +" .item-option-radio-list").length > 0){ option_id=""; $("#"+ parent_div +" .item-option-radio-list").each(function(){ if($(this).is(':checked')){ option_id += (option_id.trim()!="") ? "," : ""; option_id += $(this).attr('ref'); } if(typeof $(this).attr('mio-id') !== "undefined" && $(this).attr('mio_id') !== false){ mio_ids.push($(this).attr('mio-id')); } mio_ids = [... 
new Set(mio_ids)]; }); if(typeof $("#"+ parent_div +" .item-option-radio-list:first").attr('multiple-io') !== "undefined" && $("#"+ parent_div +" .item-option-radio-list:first").attr('multiple-io') !== false){ var io_div; $.each(mio_ids, function(key, val){ var io_selected = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:checked").length, io_min = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").attr('min-io'), io_max = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").attr('max-io'), io_name = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").parent().siblings("div.item-option-group-name-"+val).find("p").text(), io_container = $("#"+parent_div+" .item-option-radio-list[mio-id='"+val+"']:first").parent().parent(); io_container.css({"padding":"", "border":""}); if(io_selected < io_min){ io_container.css({"padding":"5px", "border":"1px solid red"}); mio_req++; if(!io_div){ io_div = io_container; } } else if(io_selected > io_max){ mio_msg.push("'"+io_name+"' can't have more than "+io_max); } }); if(mio_req > 0){ $.prompt("Please complete all sections to finish your order"); $("#"+modal_div+" .modal-body").scrollTop(0).scrollTop(io_div.position().top - 25); $('#loading_bar').html(''); return false; } if(mio_msg.length > 0){ $.prompt(mio_msg.join("<br/>")); $('#loading_bar').html(''); return false; } $("#"+parent_div+" .item-option-radio-list").attr('checked', false); checked_io($("#"+parent_div+" .item-option-radio-list"), "template4"); } else{ default_io = $("#"+ parent_div +" .item-option-radio-list:checked").attr('default-io'); if(typeof default_io !== "undefined" && default_io !== false){ $("#"+parent_div+" input[name=item-option-radio-"+group_id+"][ref='"+default_io+"']").attr('checked', true); } else{ $("#"+parent_div+" input[name=item-option-radio-"+group_id+"]:not(:disabled):first").attr('checked', true); } checked_io($("#"+parent_div+" .item-option-radio-list:checked"), "template4"); } } 
var menu_id = $(this).parents("#menu_items").data('menuid'); $.ajax({ url: 'core/mybasket.php', type: "POST", data: { "PLU":PLU, "qty":qty, "price":price, "option_id":option_id, "menu_id":menu_id }, success: function(data){ $('#loading_bar').html(''); get_cart_total(); $('#view-basket').html(data); last_io_selected = []; if (!$('#free_item_plu').length) { new PNotify({ text: 'Item added to order.', width: "220px", delay: 3000, type: 'success' }); } $("#promotional_container").removeClass('active'); } }); }); $(".menu-item-option.form-control").change(function () { var parent_div = $(this).closest('li').attr('id'); var group_id = $(this).attr('ref'); var plu = $(this).val(); $("#customise-" + group_id).attr('ref', $(this).val()); $("#" + group_id).attr('ref', $(this).val()); // we hide the custom button when it has hide-custom attr var hide_custom = $("#"+parent_div+" #menu-" + group_id + " option:selected").attr("hide-custom"); if (hide_custom){ $("#"+parent_div+" #menu-"+ group_id).parent().siblings(".item-add-buttons").children(".customise-page").css("display", "none"); }else{ $("#"+parent_div+" #menu-"+ group_id).parent().siblings(".item-add-buttons").children(".customise-page").css("display", "inline-block"); } var io = $(this).closest("li").find(".item-options"); if (io) { var item_option = io.val(); $(io).attr("id",plu+"-option-id"); io.empty(); $.ajax({ url: 'core/ajax/item_options.php', type: "POST", data: { "plu": plu }, dataType: 'json', success: function (data) { // For item-options that was hidden because of no item option on default size // We need to show it else hide if no data was returned if(data.length >= 1 && data){ $(io).fadeIn(0); var io_ids = data.map(function(key, value) { return key["id"]; }); }else{ var io_ids = []; $(io).fadeOut(0); } var option_selected=""; $.each(data, function (key, value) { //console.log(value.option_id); var price_txt = (value.item_price > 0) ? 
' - $' + value.item_price : ''; if(item_option != null && io_ids.includes(item_option)){ option_selected = item_option; } else{ if(value.default_item_option_id == value.id){ option_selected = value.id; } } $('<option />', { value: value.id, text: value.item_name + price_txt, ref: value.price }).appendTo(io) }); if(option_selected){ io.val(option_selected); } } }); option_id = $(this).closest("li").find(".item-options").val(); } }); if(check_store_stat() == 'offline') { $("#item-buttons .add-button, .item-add-buttons .customise-add-button").live("click", function(){ if (check_store_stat() == "online") { location.reload(); } }); } //if condition end //end Refresh page function }); //Refresh page function when closing modal OOA-1543 function check_store_stat() { var client_code = $("#client_code").val(); var data_status = ""; $.ajax({ type: 'POST', async: false, url: 'core/ajax/check_store_status.php', data: {client_code: client_code}, success: function(data) { data_status = data; } }); //ajax end return data_status; } //function check_store_stat() end function get_item_option(parent_div){ option_id = ''; if($("#"+ parent_div +" .item-option-radio-list").length){ var popup_price = $("#"+ parent_div +" .active #popup-price").html(); if(popup_price){ popup_price = parseFloat(popup_price.replace('$', '')); var item_option_price = 0; var item_option_ref=""; if($("#"+ parent_div +" .item-option-radio-list:checked").length){ $("#"+ parent_div +" .item-option-radio-list:checked").each(function(){ item_option_price += ($(this).val() !== undefined) ? parseFloat($(this).val()) : 0; item_option_ref += (item_option_ref.trim()!="") ? 
", " : ""; item_option_ref += $(this).attr('ref'); }); } var toppings_price = $("#"+ parent_div +" .popup-orig-price").html(); toppings_price = parseFloat(toppings_price.replace('$', '')); var popup_qty = $("#"+ parent_div +" .qty").html(); var toppings_toppings_price = parseFloat(toppings_price) * parseInt(popup_qty); item_option_price = item_option_price * parseInt(popup_qty); popup_price = popup_price * parseInt(popup_qty); var upsell_total = 0; $("#"+parent_div+" .upsell-item-chkbox:checked").each(function(){ var upsell_price = parseFloat($(this).data("price")); upsell_total += upsell_price; }); var total_prices = popup_price + toppings_toppings_price + item_option_price + upsell_total; $("#"+ parent_div + " .popup-item-price").html('$' + total_prices.toFixed(2)); option_id = $("#"+ parent_div +" .item-option-radio-list:checked").attr('ref'); } } }

      Add another bar below the sticky navigation menu, on a yellow background, showing a key to the GF, spice level (mild, spicy, and hot), vegetarian, and vegan icons. Icons will be provided as SVG.

    1. Video summary [00:00:27] - [00:27:07]:

      This video presents a lecture by Pascal Bressoux on the effectiveness of teaching practices for student learning. He discusses causality in education and the specific features of the educational world, and presents two experiments, "Parler" and "ExpiRe", that illustrate his principles. Bressoux stresses the importance of explicit, systematic language instruction for reducing educational inequalities.

      Highlights:

      + [00:00:27] Introduction and objectives
        * Presentation of the "Parler" and "ExpiRe" experiments
        * Importance of proving effectiveness in education
        * Methodology underlying the experiments
      + [00:02:05] Causality in education
        * Definition of causality and its application to education
        * Analysis of the specific features of education and the factors that contribute to it
        * Discussion of how far educational results generalize
      + [00:11:51] The "Parler" experiment
        * Goals of the experiment: improving the language skills of disadvantaged students
        * Teaching methods used and how they were implemented
        * Results and impact on language acquisition
      + [00:20:49] Specific teaching techniques
        * Explicit teaching of the alphabetic code and phonemic awareness
        * Reading comprehension and reading fluency exercises
        * Use of assessment tools to track student progress

      Video summary [00:27:10] - [00:55:36]: This video presents the results of a study on the effectiveness of teaching practices for student learning. It covers the assessment methods, the comparisons between experimental and control groups, and the impact of digital tools on mathematics teaching.

      Highlights:

      + [00:27:10] Evaluation of teaching practices
        * Study of the effectiveness of educational practices
        * Regular follow-up by pedagogical advisers
        * Individual tests for each student
      + [00:30:00] Results for the experimental group
        * Comparison with the control group and the national level
        * Significant improvement in reading comprehension
        * Reduction in the percentage of students in serious difficulty
      + [00:40:21] Experiment with the Scratch software
        * Use of Scratch to teach mathematics
        * Randomized study with about a hundred classes
        * Impact on Euclidean division, additive decomposition, and fractions
      + [00:50:55] Reception by teachers and students
        * Teachers find the approach acceptable and motivating
        * Students are more motivated at the computer than in traditional sessions
        * No reduction in educational inequalities observed

    1. Hippocratic Oath

      The Hippocratic Oath is an ethical code attributed to the ancient Greek physician Hippocrates. It is still used today by medical professionals to swear to practice medicine ethically and honestly.

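    1. I recommend VS Code. There are many reasons. (VS Code をおすすめします。理由はいくつもあります。)

      I recommend VS Code for several reasons. (いくつかの理由でVS Codeをおすすめします。)

      I think "for a number of reasons" should be 「いくつか」 ("several") rather than 「いくつも」 ("many").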

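    2. Click Code (Code をクリック)

      Confirmed.

      The writing style differs from the color theme explanation just before it, which bothered me, but it follows the original text, so I suppose we leave it as is.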

  3. notebooksharing.space
    1. idx_variables.update(index.create_variables(variables))

      index is a tuple, but should be PandasIndex. You've copied too much code over :) Write it from scratch.
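
      The failure mode described in this comment can be sketched without the real library. The class `FakeIndex` and the function `collect_index_variables` below are hypothetical stand-ins invented for illustration; they are not the notebook's code and not the actual `PandasIndex` class. The sketch only shows why calling `create_variables` on a tuple cannot work, and one way to fail early with a clear message.

```python
# Hypothetical stand-in for an index object; NOT the real PandasIndex class.
class FakeIndex:
    def __init__(self, name):
        self.name = name

    def create_variables(self, variables):
        # Return a mapping of index variables, mirroring the call in the quoted line.
        return {self.name: variables[self.name]}


def collect_index_variables(indexes, variables):
    """Merge the variables produced by each index object (illustrative helper)."""
    idx_variables = {}
    for index in indexes:
        if isinstance(index, tuple):
            # A tuple has no create_variables method, so fail with a clear message
            # instead of an AttributeError deep inside the loop.
            raise TypeError(f"expected an index object, got a tuple: {index!r}")
        idx_variables.update(index.create_variables(variables))
    return idx_variables


variables = {"x": [10, 20, 30]}
good = collect_index_variables([FakeIndex("x")], variables)
print(good)
```

      Passing `[("x", FakeIndex("x"))]` instead raises the `TypeError`, which is the situation the comment points at.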

    1. A sacramental marriage is the only kind of marriage that can exist between two baptized people. Thus, the Code of Canon Law states that “a valid matrimonial contract cannot exist between the baptized without it being by that fact a sacrament” (can. 1055 §2).
    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors use point light displays to measure biological motion (BM) perception in children (mean = 9 years) with and without ADHD, and relate it to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three BM tasks, but that those tasks loading more heavily on local processing relate to social interaction skills and those loading on global processing relate to age. There are still some elements of the results that are unclear, but nevertheless, the important and solid findings extend our limited knowledge of BM perception in ADHD, as well as biological motion processing mechanisms in general.

      We thank the editors and reviewers for their valuable feedback and constructive comments. In the revised manuscript, we have incorporated all statistics for the models and also provided detailed analytical evidence about the distinct contributions of local and global BM processing. We hope these clarifications enhance the robustness of our conclusions.

      Public Reviews:

      Reviewer #2 (Public Review):

      Summary:

      Tian et al. aimed to assess differences in biological motion (BM) perception between children with and without ADHD, as well as relationships to indices of social functioning and possible predictors of BM perception (including demographics, reasoning ability and inattention). In their study, children with ADHD showed poorer performance relative to typically developing children in three tasks measuring local, global, and general BM perception. The authors further observed that across the whole sample, performance in all three BM tasks was negatively correlated with scores on the social responsiveness scale (SRS), whereas within groups a significant relationship to SRS scores was only observed in the ADHD group and for the local BM task. Local and global BM perception showed a dissociation in that global BM processing was predicted by age, while local BM perception was not. Finally, general (local & global combined) BM processing was predicted by age and global BM processing, while reasoning ability mediated the effect of inattention on BM processing.

      Strengths:

      Overall, the manuscript is presented in a clear fashion and methods and materials are presented with sufficient detail so the study could be reproduced by independent researchers. The study uses an innovative, albeit not novel, paradigm to investigate two independent processes underlying BM perception. The results are novel and have the potential to have wide-reaching impact on multiple fields.

      We appreciate your positive feedback very much.

      Weaknesses:

      The manuscript has improved in clarity and conceptual and methodological considerations in response to the last review. However, the reported results still provide incomplete support for the claims the authors make in the paper.

      In relation to other reviewers' earlier comments, the model notation used is still not consistent and model results are reported incompletely, which makes it difficult to gain a full picture of the data and how they support the authors' secondary claims. For instance, across the models in the supplementary materials, ß coefficients are only reported selectively, which makes it difficult to assess each model as a whole. Furthermore, different terms (task 1, task 2 vs. BM-Local, BM-global) are used to refer to the same levels of a variable, and it is unclear which levels of a dummy variable correspond to which task, making it overall very difficult to comprehend the modelling procedure.

      Thanks for pointing out these issues. In the revised version, we have unified the terminology by consistently referring to task types as BM-Local, BM-Global, and BM-General. Additionally, we have clarified how the dummy variables are to be interpreted in relation to model construction. Furthermore, we corrected the model results and included all statistics in Tables S1, S2, and S3. For more detailed information, please refer to the response to your Recommendations for the authors.

      Reviewer #3 (Public Review):

      The authors presented point light displays of human walkers to children (mean = 9 years) with and without ADHD to compare their biological motion perception abilities, and relate them to IQ, social responsiveness scale (SRS) scores and age. They report that children with ADHD were worse at all three biological motion tasks, but that those loading more heavily on local processing related to social interaction skills and global processing to age. The valuable and solid findings are informative for understanding this complex condition, as well as biological motion processing mechanisms in general. However, the correlations present a pattern that needs further examination in future studies because many of the differences between correlations are not significant.

      Strengths:

      The authors present differences between ADHD and TD children in biological motion processing, and this question has not received as much attention as equivalent processing capabilities in autism. They use a task that appears well controlled. They raise some interesting mechanistic possibilities for differences in local and global motion processing, which are distinctions worth exploring. The group differences will therefore be of interest to those studying ADHD, as well as other developmental conditions, and those examining biological motion processing mechanisms in general.

      Thanks for this positive assessment of our work.

      Weaknesses:

      The data are not strong enough to support claims about differences between global and local processing with respect to social communication skills and age. The mechanistic possibilities for why these abilities may dissociate in such a way are interesting, but the crucial tests of differences between correlations do not present a clear picture. Further empirical work would be needed to test this further. Specifics:

      The authors state frequently that it was the local BM task that related to social communication skills (SRS) and not the global tasks. However, the results section shows a correlation between SRS and all three tasks. The only difference is that when looking specifically within the ADHD group, the correlation is only significant for the local task. The supplementary materials demonstrate that tests of differences between correlations present an incomplete picture. Currently they have small samples for correlations, so this is unsurprising.

      We apologize for not clarifying these points earlier. We did identify correlations between performance on all BM tasks and SRS scores. However, this finding is not unexpected, given the significant differences in SRS scores between TD and ADHD children, alongside their marked differences in all BM tasks. Correlation analyses involving data from both groups may reflect group differences. To elucidate the relationship between social ability impairment and diminished BM processing in children with ADHD, we conducted additional subgroup analyses and found correlations only in the BM-Local task. To further support the specificity of this correlation, we compared the differences in coefficients. We revised our modelling procedure for testing differences between correlations in the supplementary materials and presented all model statistics in Tables S2 and S3. Discrepancies in these coefficients, which exclude the influence of differences between groups, suggest that social factors specifically influence the performance of the BM-Local task in children with ADHD. We acknowledge that the analysis of differences between correlations is based on a relatively small sample size, and we have offered a modest interpretation in the discussion. Future studies will aim to increase the sample size to validate our findings.
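
      For readers unfamiliar with tests of differences between correlations, a generic version can be sketched with Fisher's r-to-z transformation. This is not the modelling procedure the authors describe (their models are reported in Tables S2 and S3); it is a swapped-in textbook test that applies only to correlations from independent samples, for example the same task in two different groups. Correlations between different tasks within the same children are dependent and would need a different test. The function name and all r and n values below are made up for illustration.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Two-sided test for a difference between two independent Pearson correlations."""
    z1 = math.atanh(r1)  # Fisher r-to-z transform of the first correlation
    z2 = math.atanh(r2)  # Fisher r-to-z transform of the second correlation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value under the standard normal
    return z, p


# Illustrative numbers only; they are NOT taken from the study.
z_stat, p_value = fisher_z_difference(r1=0.60, n1=40, r2=0.10, n2=40)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```

      With samples of this size the standard error is large, which is why a sizeable gap between two correlations can still fail to reach significance.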

      Theoretical assumptions. The authors make some statements about local vs global biological motion processing that may have been made in previous studies, but would appear controversial and not definitive. E.g., that local BM processing does not improve with age and is uninfluenced by attention.

      Thanks for your comment. To the best of our knowledge, there have been fewer developmental studies conducted on local BM processing compared to global BM processing. Our study is the first one to directly explore the relationship between local BM processing and age. Additionally, we used QbInattention to evaluate sustained attention function (considered as “top-down” attention) and examined its correlation with local BM processing. Some indirect evidence supports the view that the ability to process local BM cues remains stable and is unaffected by top-down attention. For example, local BM processing did not show a learning trend (Chang 2009) and was linked to the activation of subcortical regions (Hirai 2020). Research has demonstrated that local BM cues can convey information about walking direction without participants’ explicit attention or recognition (Chang 2009, Hirai 2011, Thompson 2007, Wang 2010), indicating the involvement of “bottom-up” processing (Hirai 2020, Troje 2023). Consistent with previous findings, we did not find a significant correlation between local BM processing and age or QbInattention. We acknowledge that a statement such as “local BM processing does not improve with age and is uninfluenced by attention” should be approached with caution. Therefore, we interpreted our results carefully:

      “Once a living creature is detected, an agent (i.e., is it a human?) can be recognised by a coherent, articulated body structure that is perceptually organised based on its motions (i.e., local BM cues)71. This involves top-down processing and probably requires attention25,72, particularly in the presence of competing information26. Our findings are consistent with those of previous studies on the cortical processing of BM73, as we found that the severity of inattention in children with ADHD was negatively correlated with their performance in global BM processing, whereas this significant correlation was not found in local BM processing, which may involve bottom-up processing61,65 and might not need participants’ explicit attention21,23,74,75. However, further studies are needed to verify this hypothesis.” (lines 461-470)

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Supplementary materials: For all reported results, I suggest the authors use consistent model notation with complete reporting of all statistics in line with common conventions (ideally tables reporting beta values, error terms and confidence intervals for all model predictors, as well as R squared values). In particular the beta values for the reference category are needed to be able to fully interpret the beta values for the reported contrasts.

      We appreciate your suggestion. In the newly revised manuscript, we reported all statistics, including beta values, error terms and confidence intervals for all model predictors, and R squared values. These detailed statistics can be found in Tables S1, S2 and S3. We hope this additional information will offer readers a more comprehensive understanding of our study.

      Please also address the following inconsistencies:

      - At least when reporting the model results, the same term should be used when referring to task type (either task 1/2/3 or local/global/general BM).

      Thank you for this feedback. We now use the same terms (BM-Local/Global/General) to refer to task type throughout the text.

      - Second linear model in the Supplementary Materials: The authors state that the results suggest that the correlation between SRS and task 1 is greater than that between task 2 and SRS scores. First of all, to be able to support this claim the authors need to provide the coefficient for task 1 (which, if task 1 is the reference variable should be ß1). Second, as I currently understand the reported model results, the fact that ß4 (representing the difference in relationship to SRS scores between task 2 and task 1; the authors refer to ß3 here although I assume they mean ß4) is negative and shows a trend towards significance would actually mean the relationship between BM processing accuracy and SRS scores is more negative for task 2 relative to task 1 and not, as the authors state, that the correlation with SRS scores is greater for task 1. I realise this contradicts the individual r values and scatter plots and hope the authors can clarify the model results.

      We thank you for pointing out these issues. For the second linear model (Model 4 in the revised manuscript), we reported the coefficients for all predictors and the model summaries, including the coefficient for task 1 (ß1). In addition, we have corrected the model results. The values of ß4 (representing the difference in relationship to SRS scores between BM-Global and BM-Local) and ß5 (representing the difference in relationship to SRS scores between BM-General and BM-Local) were positive and showed a trend towards significance, indicating that the correlations with SRS total score were more negative for BM-Local relative to BM-Global and BM-General:

      “A general linear model was constructed (Table S2, Model 4): SRS = β0 + β1 * ACC + β2 * D1 + β3 * D2 + β4 * (ACC * D1) + β5 * (ACC * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in correlations with SRS total score between BM-Local and BM-Global (or BM-General). The results suggested trends where the correlations with SRS total score were more negative for BM-Local relative to BM-Global (standardized β4 = 0.580, p = 0.074) and BM-General (standardized β5 = 0.550, p = 0.073).” (lines SI 36-42)

      - Third linear model in the Supplementary Materials: In the dummy variable representing task, when local BM is the reference level, which task is represented by d1 and d2, respectively? If I understand the authors' procedure correctly, d1 should represent the difference between local and global BM and d2 the difference between local and general BM. If this is true, ß4 should code for the difference between local and global BM and not, as stated by the authors, for the difference between local and general BM. Also, what is d3?

      Thank you for pointing out this issue. We corrected and clarified the results of the third model (Model 5 in the revised manuscript) in the revised version and stated what d1 (D1) and d2 (D2) each represent:

      “We recoded task types into two dummy variables, D1 and D2, using BM-Local as a reference. The coefficient of D1 represents the difference in relationship to age between BM-Local and BM-Global, and the coefficient of D2 represents the difference in relationship to age between BM-Local and BM-General. The following model was created for each group (Table S3, Model 5-6): ACC = β0 + β1 * age + β2 * D1 + β3 * D2 + β4 * (age * D1) + β5 * (age * D2). If the effect of the interaction term (i.e., β4 or β5) is statistically significant, it indicates a difference in the effect of age on ACC between BM-Local and BM-Global (or BM-General). In the ADHD group, we observed a significant difference in the effect of age on ACC between BM-Local and BM-General (standardized β5 = 0.462, p < 0.001) and marginally significant differences in the effect of age on ACC between BM-Local and BM-Global (standardized β4 = 0.228, p = 0.073).” (lines SI 47-57)
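A minimal sketch of the dummy-coded interaction model quoted above, run on synthetic, noise-free numbers that are not from the study. It relies on the fact that, when both dummy variables and both interaction terms are in the model, the OLS point estimates match separate per-task regression lines: β1 is the BM-Local slope, and β4 and β5 are the BM-Global and BM-General slopes minus the BM-Local slope. All function and variable names are assumptions for illustration, the coefficients are unstandardized, and significance testing would need a statistics package.

```python
from statistics import mean


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def interaction_coefficients(age_by_task, acc_by_task, reference="local"):
    """Point estimates of ACC = b0 + b1*age + b2*D1 + b3*D2 + b4*(age*D1) + b5*(age*D2).

    The reference task supplies b0 and b1; every other task contributes an
    intercept difference (dummy term) and a slope difference (interaction term).
    """
    a_ref, b_ref = ols_line(age_by_task[reference], acc_by_task[reference])
    coefs = {"b0": a_ref, "b1": b_ref}
    for task in age_by_task:
        if task == reference:
            continue
        a, b = ols_line(age_by_task[task], acc_by_task[task])
        coefs[f"intercept_diff_{task}"] = a - a_ref
        coefs[f"slope_diff_{task}"] = b - b_ref
    return coefs


# Hypothetical data: accuracy rises with age at a different rate per task.
ages = [6, 7, 8, 9, 10, 11]
true_slopes = {"local": 0.01, "global": 0.03, "general": 0.05}
age_by_task = {task: ages for task in true_slopes}
acc_by_task = {task: [0.4 + s * a for a in ages] for task, s in true_slopes.items()}

coefs = interaction_coefficients(age_by_task, acc_by_task)
print(coefs)
```

With these made-up slopes, a positive slope difference means age has a stronger effect on accuracy in that task than in BM-Local, which is the reading the authors give to β4 and β5.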

    1. Once you have finished the exercise, you can run the following command in the VS Code terminal: pytest tests.py.

      Does not work...

    2. Download the code file script_p3c2.py from this folder. Read it and make sure you understand it completely.

      Note: the 'class' of the titles has changed and is now: govuk-link

    1. Another limitation involves DITTO speed: DITTO is slower than training-free approaches (prompting) and SFT (15 minutes with DITTO vs. 2 minutes with SFT on 7 demonstrations). A bottleneck lies in sampling, though we suspect a mix of prior (e.g., vLLM [25]) and future work in LLM inference optimization can improve DITTO’s speed. Finally, DITTO is uninterpretable. It is unclear exactly what a model learns after several iterations: do values shift too, or is it just style? We also suspect that forgetting may affect DITTO. Even with LoRA, models DITTO-ed on writing sometimes refuse to generate code. Related work on overgeneralization might mitigate these effects [40].

      DITTO faces limitations such as biases in GPT evaluations, slower training speed compared to other methods, and unclear learning processes that may lead to forgetting previous knowledge.

    1. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors recorded activity in the posterior parietal cortex (PPC) of monkeys performing a perceptual decision-making task. The monkeys were first shown two choice dots of two different colors. Then, they saw a random dot motion stimulus. They had to learn to categorize the direction of motion as referring to either the right or left dot. However, the rule was based on the color of the dot and not its location. So, the red dot could either be to the right or left, but the rule itself remained the same. It is known from past work that PPC neurons would code the learned categorization. Here, the authors showed that the categorization signal depended on whether the executed saccade was in the same hemifield as the recorded PPC neuron or in the opposite one. That is, if a neuron categorized the two motion directions such that it responded stronger for one than the other, then this differential motion direction coding effect was amplified if the subsequent choice saccade was in the same hemifield. The authors then built a computational RNN to replicate the results and make further tests by simulated "lesions".

      Strengths:

      Linking the results to RNN simulations and simulated lesions.

      Weaknesses:

      Potential interpretational issues due to a lack of evidence on what happens at the time of the saccades.

    1. Once you have finished the exercise, you can run the following command in the VS Code terminal: pytest tests.py.

      It would have been helpful to include the same note as in the previous chapter about installing the pytest-mock library.

    1. Once you have finished the exercise, you can run the following command in the VS Code terminal: pytest tests.py.

      Even using the code provided in the solution, I cannot get the test program to work. Could you check that it works correctly? Also, the command pytest tests.py does not work on any of the test programs. I use the command python -m pytest tests.py instead, and that works on all of the test programs (the functional ones, at least...).
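A minimal sketch of the workaround mentioned in the note above, assuming pytest is installed for the interpreter in use. Running pytest as a module uses that same interpreter's installation rather than whichever `pytest` executable is on the PATH, and the pytest documentation notes that `python -m pytest` also adds the current directory to `sys.path`. The helper names are hypothetical.

```python
import subprocess
import sys


def pytest_command(test_file="tests.py"):
    """Build the argument list that runs pytest through the current interpreter."""
    return [sys.executable, "-m", "pytest", test_file]


def run_tests(test_file="tests.py"):
    """Run the test file and return pytest's exit code (0 means all tests passed)."""
    return subprocess.run(pytest_command(test_file), check=False).returncode
```

Calling `run_tests()` from the exercise folder is then equivalent to typing `python -m pytest tests.py` in the terminal.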

    1. CRM Systems: Salesforce.com: Sales Cloud, Service Cloud
       • Salesforce Technologies: Apex Language, Apex Classes/Controllers, Apex Triggers, Eclipse, SOQL, SOSL, Custom objects, Validation rules, Apex Web Services, Force.com Sites, Lightning Components, Lightning Events.
       • Tools: Force.com IDE, Force.com Data Loader, Force.com Platform (Sandbox, and Production), Rally, Jira, ANT, Git, Bitbucket, VS Code, Jenkins
       • Languages: Apex, JAVA

    1. Web Technologies: HTML5, CSS3, JavaScript (ES6), jQuery, Bootstrap, Angular, React, Node Js, Java, AJAX, JSON, XML, SASS, JS Build and Package management (Bower, Gulp, Grunt, NPM, etc.), RESTFUL SOAP Web Services.
       Programming Languages: C, Core Java
       IDE and Graphic Tools: Eclipse, Code-Blocks, NetBeans, Visual Studio, Sublime Text, Fire Bug, Chrome Developer
       Web/App. Servers: Apache Tomcat, WebLogic
       Version Control: GIT, SVN
       Testing: Unit Testing with Karma and Jasmine

    1. Languages Java 1.8, Python, JavaScript, TypeScript, PL/SQL.Technologies Spring, Spring Boot, Spring Batch, Spring Data, Restful, Microservice,Spring MVC, Spring REST, Servlets, JMS, JSP, JSTL, Custom TagsWeb Technologies JavaScript, CSS3, SCSS, Angular, Bootstrap, AJAX, Velocity Templates,HTML5, React JS, and NodeJSJavaScript Technologies Angular 6/7/9/11, React JS, Node JS, Express JS, Ext JS, Backbone JS,Express JS.Frameworks Spring Boot, Spring Framework, Hibernate, Angular Framework, StrutsFramework, Junit, Spring JPA, Spring REST, Spring Web Flux, Spring WebFlow, Spring Security, Hibernate.Database Tools Toad for Oracle, Toad for MySQL, Oracle SQL developer, DB Visualizer,Mongo Compass, PG Admin, Robo Mongo, MySQL Workbench, DBeaverDatabases Oracle 9i/11g/12c, IBM DB2, Mongo Database, MS-SQL Server,PostgreSQL, MySQL, Cassandra, RDS, DynamoDB.Web Services/Specifications SOAP Web Services (JAX-RPC, JAX-WS), RESTful web services (JAX-RS)Web/Application servers Apache Tomcat 8/9, IBM WebSphere, Jetty, WebLogic 10/12, JBoss, NginxCloud Technologies AWS (EC2, S3, SNS, CloudWatch, Cloud Formation Template, RDS, VPC,Auto Scaling, IAM), PCF, DockerVersion Control Git, Tortoise SVN, Bit Bucket, GitHub, CVSIDEs Eclipse, Spring tool suite (STS) IntelliJ, Net beans, JDeveloper, JetBrains,Visual Studio CodeBuild Tools ANT 1.7,1.8,1.9, Maven various versions, Gradle, Ivy, WebpackCI/CD Tools Jenkins, Bamboo, Urban Code Deploy, ConcourseLogging & Monitoring Log4J, SLF4J, Splunk, Zipkins, GrafanaWhite box Testing Junit 3, Junit 4, DB Unit, Mockito, Easy Mock, Power Mock, TestNG,Karma, Protractor, Cucumber, Selenium.Black box Testing HP Quality Center, JIRA, Bugzilla.Performance Testing JMeter, Load UI, Load Runner, WinRunner.ORM Frameworks Hibernate 4, JPA, Spring JPA.Methodologies Agile (XP, Scrum) and SDLC (Waterfall) modelOperating systems Windows 10/7/XP, UNIX, AIX, OEL, Mac, Linux Sun Solaris, Ubuntu Server11/12/14Cloud Technologies AWS (Lambda, EC2, S3, SNS, 
CloudWatch, RDS, VPC, IAM), Azure

    1. Configuration Management: Ansible
       Scripting Language: Python Scripting for automation; Bash Shell Scripting
       Containerization: Docker
       Continuous Integration (CI): Jenkins, Maven, Helm
       Documentation and management: Confluence, Jira, Kanban board
       Version Control System (VCS): Git, Github
       Artifact: AWS S3, Nexus
       Monitoring and alerts: AWS CloudWatch, Splunk
       Infrastructure as Code (IaC): Terraform, AWS CloudFormation
       Container Orchestration: Kubernetes, AWS EKS, AWS ECS, AWS Fargate
       Container Registries: Docker Hub, AWS ECR (Elastic Container Registry)
       Web Servers: Apache, Tomcat
       SDLC: Agile, Scrum
       AWS Developers Tools: AWS CodeCommit, AWS CodeBuild, AWS CodeDeploy and AWS CodePipeline
       Operating Systems: Unix/Linux, Windows, Windows Server 2012, and 2016
       Linux Distributions: Centos 6, 7 & 8, RHEL 6, 7 & 8, Ubuntu
       Virtualization Platforms: Oracle Virtual Box, VMware Workstation
       Amazon Web Services: VPC, EC2, S3, IAM, SNS, ELB, Auto Scaling, Route 53, Lambda; Elastic Beanstalk, EFS, EBS, CloudTrail, Trusted Advisor, AWS Organizations, CloudFront, WAF
       Networking/Protocols: TCP/IP, FTP, SCP, SSH, SSL, DNS, HTTP, HTTPS, DHCP, VPN, and LDAP
       Database: MySQL, RDS, DynamoDB
       Team Communication Tools: Slack, Microsoft Teams, Skype for Business, Mattermost
       Web Development: HTML, CSS
       Languages: English and French

    1. SDLC Methodologies Waterfall, Agile/ ScrumProgramming Languages C, C#.NET, ASP.NET, VB.NET, T-SQL, PL/SQL,.NET COREsWeb Technologies ASP.Net Core,ASP.NET, CSS, RAZOR, HTML, XHTML, XML,AJAX, Angular JS, Angular2/4/6/7/8/9/10, React JS, Node JSDesign Patterns MVC, MVVMScripting Languages JavaScript, JQUERYRPA UiPath, Blue Prism and Automation AnywhereJS Frameworks Angular 7/8/9, React JS, Node JSIDE Visual StudioFrameworks MS Visual Studio 2017/2015/2012/2010/2008/2005, .NET Framework,Twitter bootstrap, MVC 2.0/3.0/4.0/5.0, Visual Studio Code,.Net Core2.2O/R Mapping LINQ, Entity FrameworkServer IISDatabases MS SQL Server, MS Access, Oracle, DB2Version Control GitHub, Team foundation Server (TFS), Visual source safe (VSS), SVNBusiness Modeling Tools Rational Rose, MS-Visio, MS PowerPoint, Microsoft Office SuiteReporting Tools MS-SQL Server Reporting Services (SSRS), Crystal ReportMessage Brokers RabbitMQ

    1. Programming Languages: Objective-C, Swift, Java, Kotlin, C#, Assembly, SQL, VB, HTML, CSS, JavaScript.
       Web Services: RESTful, SOAP, JSON, XML.
       Quality: Continuous Integration, CI/CD, Unit Test, Functional Test, Scenario Test, Automated Testing.
       Testing Tools: XCTest, Travis, TestFlight, Instruments, Allocations.
       Design Patterns/Architectures: MVC, MVP, MVVM.
       Databases: SQLite, SQL, MySQL, Oracle, Firebase.
       IDEs: Xcode, Visual Studio, Eclipse, Code Blocks, NetBeans, Android Studio, Genexus.
       Project Methodology and Tools: Agile, PSP/TSP, Scrum, JIRA.

    1. Programming Languages: SwiftUI, Swift, Objective-C, Java, Kotlin, C#, Assembly, SQL, VB, HTML, CSS, JavaScript.
       Web Services: RESTful, SOAP, JSON, XML.
       Quality: Continuous Integration, CI/CD, Unit Test, Functional Test, Scenario Test, Automated Testing.
       Testing Tools: XCTest, Bitrise, TestFlight and Instruments.
       Design Patterns/Architectures: MVC, MVVM
       Databases: SQLite, GRDB, SQL, MySQL, Oracle, Firebase.
       IDEs: XCode, Visual Studio, Eclipse, Code Blocks, NetBeans, Android Studio, Genexus.
       Project Methodology and Tools: Agile, PSP/TSP, Scrum, JIRA, and Confluence.
       iOS Development

    1. デフォルトのPythonに

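      I have drafted an alternative translation for the four lines starting here.

      Original: Python doesn’t include pytest by default, so you’ll learn to install external libraries. Knowing how to install external libraries will make a wide variety of well-designed code available to you. These libraries will expand the kinds of projects you can work on immensely.

      Proposed alternative

      pytestはPythonにはデフォルトでは含まれていないので、外部ライブラリのインストール方法についても学びます。外部ライブラリをインストールするということは、さまざまな洗練されたコードが手に入るということです。これらのライブラリによって取り組めるプロジェクトの種類が格段に増えます。

      Notes on the alternative

      Rendering "Knowing how to ~" as 「~方法を知ることで」 ("by knowing how to ~") reads like translationese, so I translated it freely, treating "knowing how" as "being able to". Also, since the point is that the kinds of projects you can work on expand greatly, I changed 「できることを大きく拡張」 ("greatly expands what you can do") to 「種類が格段に増える」 ("the variety increases dramatically").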

    1. Creating images and Deploying applications like: .Net, Java, NodeJS, Python, React ...• Stream-proceessing – RabbitMQ, Kafka (building from scratch, setup)• Migration On-Premises - Cloud• Continuous Integrations – Jenkins (Groovy), TeamCity, GitLab, Harness.io, ConcorseCI, TektonMaven, Gradle, NPM (C++, nodejs, C#, Python, Go, Java pipelines)• Binary Repository Managers – JfrogArtifactory, Nexus• Web Servers – Nginx, Apache, Tomcat, JBoss, WildFly• Scripting – Bash, Python(numpy, pandas, sklearn, dash, dask, flask, boto, etc), PowerShell• Monitoring – DataDog, Zabbix, ELK, SPLUNK, CloudWatch, AppMonitoring – Dynatrace• Logging/Monitoring – ElasticSearch/OpenSearch, Splunk, New Relic, Apigee• Source code management – GIT, BitBucket, GitHub, GitHub Enterprise Cloud - end to end, GitLab• Salesforce – Administration and Development – Provision Production, Sandboxes (Dev, DevPro,ParitalCopy, Full, Templates), Manual Provisioning (Salesforce Setup), Metadata Deployment Tools(ChangeSets, CLI, MetadataAPI), SalesforceDX, VSCode, Terraform provisioning, Incrementaldeployments(SalesforceDX/MetadataAPI package.xml force:source:deploy/force:mdapi:deploy),force.com, welkin suite build applications Apex, Visualforce, testing, troubleshooting, evaluation (Sales,Marketing Cloud, Lightning, WebComponents), REST, Tooling, Metadata API, Workflow Configuration,Process Builser, Validation rules, Formulas, Security (roles, profiles, permission sets)• Operating Systems – MacOS, iOS, Linux (Redhat, CentOS, Ubuntu, Debian), Windows.• Containerization + Orchestration – Docker, Docker-Compose, AWS EKS, Kubernetes / Openshift• Databases – MySQL, DynamoDB, Aurora, Redis, Redshift, MS SQL, Snowflake, Grafana, Prometheus,Visualization, Bigdata, BigQuery.• Data Visualization - Tableau• Testing Automation – Selenium, SlimerJS, PhantomJS, CasperJS
    2. AWS Cloud infrastructure – EC2, S3, RDS, ECS, EKS, AppRunner, Route53, ELB, Auto Scaling,Lambda, StepFunctions, CloudWatch, CodeArtifact, IAM, API Gateway, AWS CLI Automation, ),PySpark, Glue, Athena, Data Pipeline, Data Exchange, Lake Formation (AWS Certified)• Azure Cloud Infrastructure, VM, Storage, Databases, Networking, AutoDeploy, PowerShell Azure CLIAutomation, AzureDevOps, AKS• GCP Cloud Infrastructure – Compute, Storages, Databases, Networking, IAM, CloudFunctions, Logging,Pub/Sub, CloudBuild, BigQuery, CloudSheduler, GKE• Microservices / Kubernetes / Openshift - Design, Deploy, Administration, Scaling, Troubleshooting, Vault, ConsulDeployments to K8s – writing deployment.yaml files, writing Helm charts.• Configuration management – Ansible/ Ansible Tower / Salt• Data Engineering, Design and Analisys, ETL – Design Data Flow, Creation of Data Pipelines, ETLDevelopment, Data Migration, Conversion, Cleansing, Sanitization• Databrics – Data Science, Analytics, ETL, Machine Learning, Visualizations• Python Development - console applications for the different tasks, e.g. sync data across multipledatabases, web scrapping, aws/gcp/twilio/..., web-apps with FlaskPytest - cover business logic in the applicationsUsed Python modules: Matplotlib, Searborn, requests, Beautiful Soup, numpy, scipy, matplotlib, python-twitter, Pandas data frame, network, urllib2, urllib3, NLTK, pillow, pytest, gradio, TensorFlow, SciPy,Statmodels, Keras, MySQLMachineLearning – TensorFlow, Scikit, Pytorch, Keras, Rapid Miner, SparkMLib• Go Development – automations, dockerized microservices, ETL Jobs, backgroud jobs, monitoring.• .Net Development – Asp.Net Core, Web API, Blazor, Autofac, Mediatr, Hangfire, XUnit, NUnit,NSubstitute. 
Creation and deployment of the dockerized microservices.• Apache Kafka - design, build, and maintenance of scalable clusters, develop and implement strategiesfor data ingestion, processing, and storage, monitoring and optimization, troubleshooting• Hadoop clusters with CDH4.4 on CentOS – Provisioning, Setup, Maintenance, Troubleshooting• Infrastructure as Code (AWS, GCP, AZ) – Terraform / Terragrunt / Atlantis / Pulumi - fromscratch/modifying and developing existing / writing modules / using public modules / with or without TFE,troubleshooting, fixing issues, upgrading to new version, “state file surgery”• Code quality and Security Control – SonarQube, Veracode, CodeScan, CodeInspector

    1. Software/tools not mentioned in limited job sampling above:Docker. Classic ASP master. Native Andr0id and i0S development since 2012 using B4X, B4A, and B4i. REST/RESTful API. JSON API.Databases:Pro with PostgreSQL, MySQL, SQL Server, and MS Access: design, setup, and use of relational databases (RDBMs), writing SQL queries,and creation of [stored] procedures (not expert). No M0ngo or other N0SQL yet.Miscellaneous, front-end, and older:Agile Scrum with Jira, Visual Studio Code (VS Code) with Git (GitHub). MS Teams and Confluence. Setting up and using Oracle VMVirtualBox, Hyper-V. Postman for GET/POST and automation of API. B4x, and B4a for native Andr0id dev. B4i for native i0S m0biledev. FTP/SFTP. Microsoft Visual Studio, and Interdev since v1.0 but haven’t used in about 10 years. Classic ASP w/VBscript expert;still maintaining the CMS I built in that language, pre-WordPress times. VB.net rusty. Experience manipulating Excel from VB.Net butnot looking for .Net job. VBA in MS Access and Excel since 1994. Google Sheets scripting novice. IIS expert. Lone, team, and pairedprogramming. Visual Basic expert since version 1.0. HTML. HTML5. XML. CSS, Bootstrap, and Flexbox by hand for adaptive andresponsive web pages. Photoshop. Affinity Photo, Visio. Asana. Slack. Technical and creative writing expert; authored hundreds ofcoding tutorials and articles. Experienced with video editing using Davinci Resolve and other editors. 2D and 3D animation using Poser,Bryce, Daz3D, Adobe Animate, Flash, and more.

    1. Big Data Eco System: HDFS, Spark, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Kafka Connect, Impala, Stream sets, Oozie, Airflow, Zookeeper, Amazon Web Services.
       Hadoop Distributions: Apache Hadoop 1x/2x, Cloudera CDP, Hortonworks HDP
       Languages: Python, Scala, Java, Pig Latin, HiveQL, Shell Scripting.
       Software Methodologies: Agile, SDLC Waterfall.
       Databases: MySQL, Oracle, DB2, PostgreSQL, DynamoDB, MS SQL SERVER, Snowflake.
       NoSQL: HBase, MongoDB, Cassandra.
       ETL/BI: Power BI, Tableau, Informatica.
       Version control: GIT, SVN, Bitbucket.
       Operating Systems: Windows (XP/7/8/10), Linux (Unix, Ubuntu), Mac OS.
       Cloud Technologies: Amazon Web Services, EC2, S3, SQS, SNS, Lambda, EMR, Code Build, CloudWatch. Azure HDInsight (Databricks, Data Lake, Blob Storage, Data Factory, SQL DB, SQL DWH, Cosmos DB, Azure DevOps, Active Directory).

    1. ▪ Cloud Platforms: Azure
       ▪ Scripting: JSON, YAML, Shell Scripting
       ▪ Operating System: Windows, Linux (CentOS, RedHat, Ubuntu)
       ▪ Version Control Systems: GitHub, Azure Repo
       ▪ Networking: VPN, Load Balancing, Reverse Proxy, Firewalls
       ▪ Application Monitoring: SiteScope, Prometheus, Grafana, Azure Monitor, Real User Monitor
       ▪ Infrastructure as Code: Terraform, Ansible
       ▪ HDFS – Yarn, Kafka, Zookeeper, Spark
       ▪ Databases: MySQL, SQL Server, HBase, Cassandra DB
       ▪ CI/CD: Jenkins, GitOps, Argo CD, Azure pipelines
       ▪ Containers: Docker, Kubernetes

    1. Operating Systems: Redhat Enterprise Linux, Ubuntu, Alpine, CentOS
       • Configuration Management and IaaS: Puppet, Terraform, Satellite, Ansible
       • Cloud Environment: AWS, GCP
       • Monitoring: Nagios, Splunk, Vistara, Prometheus, Grafana, Loki, Lens
       • Database: Hbase, Mysql, Postgres
       • Backup: NetApp
       • Microservices and PaaS: Kubernetes, Docker Compose, Openshift, EKS, ECS, RKE
       • Virtualization: Vmware Esxi and Vspere Client, KVM, Virtual Box
       • Container: Docker, Vagrant
       • SCM: Git & GitHub & Gitlab, Bit Bucket
       • Storage: S3
       • Ticketing: Bugzilla, Remedy, Vistara, Help Desk
       • Scripting/Code Development: Python, Ruby, Perl, Bash
       • Others: Disa STIGS, LAMP, Apache, Tomcat, Jenkins
       • GitOps: ArgoCD
       • CI/CD: Jenkins, ArgoCD

    1. Language: RobotFramework, Python, Java, AppleScript, Perl Script, Shell scripting, Batch Script, Hive, Impala, SQL, AutoIT, Swing, C++
       Testing Tools: Zephyr, HP Quality Center and ALM 11.52, Selenium, TestNG, Keyword-driven, ATDD, TDD, Pytest
       IDE Tools: Visual Studio Code, PyCharm, Eclipse, Visual Studio, Apple Script Editor, AutoIt Script Editor

    1. Linux, Apache, AngularJS, Jenkins, Linux, C++, BASIC, Pascal, MongoDB, Foxpro, dBase,Paradox, Access, DataEase, MySQL, Oracle, Sybase, Watcom, DB2, Informix, JSON, XML, T-SQL, JavaScript,ITIL, DevSecOps, Agile, Scrum, CMMi, Infrastructure as Code (IaC), Blue/Green deployments, Infrastructure as aService (IaaS), Platform as a Service (PaaS), virtual networks, virtual machines, cloud services, big data analyticsthat leverage threaded or parallel processing (such as Hadoop), predictive analytics systems & Artificial Intelligence(AI), data visualization tools (such as Power BI, Tableau, and Tibco), Worked with Java, Oracle, SQL Server, Sybase,PL/SQL, Toad, PowerBuilder, VB.Net, C#, ASP.Net, Visual Basic, ASP, T-SQL, Novell NetWare, Borland Delphi,Paradox, Access, and other tools, etc

    1. SECURITY INFORMATION AND EVENT MANAGEMENT (SIEM)
       • WIRESHARK
       • LOG ANALYSIS
       • SECURITY AUDIT
       • TECHNICAL WRITING
       • SECURITY POSTURE ASSESSMENT
       • STAKEHOLDER MANAGEMENT
       • PROJECT MANAGEMENT METHODOLOGIES, SUCH AS AGILE AND WATERFALL
       • SECURITY CODE ANALYSIS
       • THREAT MODELING
       • FIREWALL
       • SECURITY STANDARDS
       • IDS
       • SECURITY CONTROL ASSESSMENT
       • STRONG ANALYTICAL AND PROBLEM-SOLVING ABILITIES
       • RISK ASSESSMENT AND MITIGATION
       • FIREWALL AND INTRUSION DETECTION SYSTEMS

    1. Microsoft SharePoint Server• ASP.NET MVC , Entity Framework, ADO.Net, XML Web Services,Web API, Angular, JQuery, JavaScript, XML, AJAX, Transact-SQL,HTML5, XML, CSS, Microsoft Azure Data Studio 1.3, Microsoft SQLAzure 12, and Microsoft SQL Server Management Studio 18, Toad,Microsoft Project• Project Management and SAFe Lean Agile methodologies• Experience in building and consuming Web services, WCF andREST API Services• Experience with GIT for Code Repository, Visual SourceSafe, SVN,and Microsoft Team Foundation Server• Expertise in Database Design and Database Programming usingSQL Server 7.0 2000/2005/2008/2012/2019 and Oracle. ETL &Big Data. Pentaho Kettle Solutions• MySQL, Sybase, Transact-SQL, Microsoft SQL Server, SSIS, SSRS,SSAS, SSMS, SSMA, High Availability, Big Data• ODBC, LINQ, OLE DB, ADO, XML, XHTML, XSLT, CSS, HTML5 ,REST, JSON , AJAX, RAZOR, XML Web Services, SOAP, WSDL,WADL, ASP.NET, MVC, OWIN,• VBScript, JavaScript, TypeScript, JQuery, Angular, PHP, ObjectPascal, Delphi• Object PAL, Erwin, Paradox, DAX, XBase, dBase, Microsoft VisualFoxPro 9.0, FoxPro for Macintosh, Microsoft Access, Toad• Microsoft Office 365, Power BI• Windows NT/Windows, DevOps, PaaS, IaaS, Server/WindowsCE/Windows Mobile, Social Authentication

    1. pen-source software (F

      "A natural initial question is what is open source software? Roughly, being open source requires that the source code, and not only the object code (the sequence of 1's and 0's that computers actually use), be made available to everyone, and that the modifications made by its users also be turned back to the community."(Lerner & Tirole, 2001).

      Lerner, J., & Tirole, J. (2001). The open source movement: Key research questions. European economic review, 45(4-6), 819-826.

      https://hypothes.is/groups/x4RQA5XX/edci-338-a01-summer-2024

    1. Some more of my recent learning with devcontainer.json (its Dev Container metadata):

      • Interactive commands (those waiting for user input, like read) do not display the input prompt in at least the onCreateCommand and postCreateCommand sections, so it is better to keep them in updateContentCommand or postAttachCommand.
      • If there are 2 read commands in a single section, like updateContentCommand, only the 1st one is displayed to the user, and the 2nd one is ignored.
      • When I put a read command within a dictionary (with at least 2 key/value pairs) of postAttachCommand, the interactive command wasn't displayed.
      • We need to use /bin/bash -c to be able to use read -s (the -s flag) which allows for securely passing the password so that it does not stay in the VS Code console. Also, I had trouble with interactive commands and if statements without it.
      • Using "GITLAB_TOKEN": "${localEnv:GITLAB_TOKEN}" does not easily work as it is looking for GITLAB_TOKEN env variable set locally on our host computers, and I believe no one does it.
      • The dictionary seems to be executing its scripts in parallel; therefore, it is not easily possible to break down long lines which have to execute in a chronological sequence.
      • JSON does not allow for human-readable line breaks; therefore, indeed, it seems impossible to improve the long one-liners.
      • The files/folders mentioned within mounts need to exist locally (otherwise, Docker container build fails). They are mounted before any other section. Technically, we can protect ourselves with the following command to find an extra message in VS Code container logs:

      json "initializeCommand": "/bin/bash -c '[[ -d ${HOME}/.aws ]] || { echo \"Error: ${HOME}/.aws directory not found.\"; exit 1; }; [[ -f ${HOME}/.netrc ]] || { echo \"Error: ${HOME}/.netrc file not found.\"; exit 1; }; [[ -d ${HOME}/.ssh ]] || { echo \"Error: ${HOME}/.ssh directory not found.\"; exit 1; }'",

      Another option is to avoid the error entirely, but this creates files on the host machine, so it is not an ideal solution:

      json "initializeCommand": "mkdir -p ~/.ssh ~/.aws && touch ~/.netrc",
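One way to keep the existence check readable is to move it out of the JSON one-liner into a small script. The sketch below is a hedged Python equivalent of the first initializeCommand above, assuming Python 3 is available on the host; the function names and the idea of storing it as a separate file are assumptions, not part of the original setup. Unlike the one-liner, it reports every missing path instead of stopping at the first.

```python
from pathlib import Path


def missing_mount_sources(home, required_dirs=(".aws", ".ssh"), required_files=(".netrc",)):
    """Return one error message per mount source that is missing under `home`."""
    home = Path(home)
    problems = []
    for name in required_dirs:
        if not (home / name).is_dir():
            problems.append(f"Error: {home / name} directory not found.")
    for name in required_files:
        if not (home / name).is_file():
            problems.append(f"Error: {home / name} file not found.")
    return problems


def main():
    """Print any problems and return a shell-style exit code (1 if something is missing)."""
    problems = missing_mount_sources(Path.home())
    for line in problems:
        print(line)
    return 1 if problems else 0
```

A script built this way would end with `raise SystemExit(main())` and could be referenced from initializeCommand by a path of your choosing, so the non-zero exit code stops the container build just as the `exit 1` in the one-liner does.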

    1. Although many projects and ideas share Elinor Ostrom's personal, cooperative and Earth-helping significance, they lack the chain reaction that keeps them going.

      for - quote - chain reaction - why good projects fail - (see below)

      • Although many projects and ideas share Elinor Ostrom's personal, cooperative and Earth-helping significance,
        • they lack the chain reaction that keeps them going.
      • On the contrary, this flame is extinguished by
        • the direct action (fakes), or
        • indirect action (ignoring or taking our attention elsewhere)
      • of the mainstream media that in a certain sense has lost that "code of ethics" of journalism that upheld values; such as
        • truthfulness,
        • independence,
        • objectivity,
        • fairness,
        • accuracy,
        • respect for others,
        • public accountability...
    1. Containerization: Docker, ECS, Kubernetes, EKS
       DevOps/Tools: Git, GitHub, Bitbucket, Jenkins, Bamboo, SourceTree, Maven, Ansible (YAML playbooks, Ansible Tower), Chef, CloudWatch, Route53, EC2, ELB, Auto Scaling, RDS, IAM, S3, Fargate, CloudFront, Lambda, CodeCommit, Code Build, Code Pipeline, VPC, Infrastructure as code Terraform, Cloud Formation
       Scripting/Programming: Python, Shell, Perl, Ruby, SOL, PHP, HTML
       Databases: MySQL, SQL Server, PostgreSQL, MongoDB
       Web/App Servers: Apache Tomcat, Nginx, Web Logic
       Bug Tracking Tools: JIRA, Rally, Confluence, Service Now
       Monitoring tools: Nagios, Cloud Watch, Grafana, Prometheus, ELK, Splunk

    1. Annotators are warned repeatedly not to tell anyone about their jobs, not even their friends and co-workers, but corporate aliases, project code names, and, crucially, the extreme division of labor ensure they don’t have enough information about them to talk even if they wanted to

      I find it interesting that they can't even tell friends what their job involves. I guess if you work with billion-dollar companies, you can't share any information that could be passed on to a competitor.

    1. Environment, Operating Systems: Visual Studio 2003, 2005, 2008/2010 Windows Server 2000/2003/2008 & Windows 7 1.1/2.0/3.0/3.5, WCF
       • MS.Net Framework: C#.NET, VC++.NET, VB.NET, ASP.NET, VTK HTML/DHTML, XML
       • Web Tools Source: Team Foundation Server 2010, Visual Source Safe & PVCS
       • RDBMS: SQL Server 2000/2005/2008, Oracle 11g/10g/9i
       • Methodologies Other Tools: N-Tier architecture, OOP Concepts, Complete SDLC Microsoft Visio, MS Office, QTP (Quick Test Professional)
       • Configuration Management Tools: Ansible, Maven, Kubernetes, Docker, Splunk
       • Cloud Platform: Amazon Web Services EC2, Simple Storage Service (S3), RDS, CloudTrail, CloudFront, Microsoft Azure DevOps, Microsoft, Storage Accounts, Azure Repos
       • Version Control: GIT, BitBucket, CodeCommit, Subversion (SVN)
       • Issue Tracking Tools: JIRA, ServiceNow, Azure Boards
       • Web Servers: Apache, WebLogic, WebSphere 7.0, 8.5, 8.5.5
       • Programming Languages: JavaScript, Python, UNIX
       • OS and Other Tools: Windows All versions, Skype, JIRA, Confluence, UNIX, Linux
       • Monitoring Tools: Splunk, Dynatrace.

    1. Note: In this example, after crossover and mutation, the least fit individual is replaced by the new fittest offspring.

      The code gets cut out by freedium, so here is all the code after doing a ctrl-a then ctrl-c:

      ```java
      import java.util.Random;

      // Main class
      public class SimpleDemoGA {

      Population population = new Population();
      Individual fittest;
      Individual secondFittest;
      int generationCount = 0;
      
      public static void main(String[] args) {
      
          Random rn = new Random();
      
          SimpleDemoGA demo = new SimpleDemoGA();
      
          //Initialize population
          demo.population.initializePopulation(10);
      
          //Calculate fitness of each individual
          demo.population.calculateFitness();
      
          System.out.println("Generation: " + demo.generationCount + " Fittest: " + demo.population.fittest);
      
          //While population gets an individual with maximum fitness
          while (demo.population.fittest < 5) {
              ++demo.generationCount;
      
              //Do selection
              demo.selection();
      
              //Do crossover
              demo.crossover();
      
              //Do mutation under a random probability
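              // nextInt(7) is uniform on 0..6, so mutation runs with probability 5/7;
              // nextInt() % 7 can be negative, which made the test pass too often
              if (rn.nextInt(7) < 5) {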
                  demo.mutation();
              }
      
              //Add fittest offspring to population
              demo.addFittestOffspring();
      
              //Calculate new fitness value
              demo.population.calculateFitness();
      
              System.out.println("Generation: " + demo.generationCount + " Fittest: " + demo.population.fittest);
          }
      
          System.out.println("\nSolution found in generation " + demo.generationCount);
          System.out.println("Fitness: "+demo.population.getFittest().fitness);
          System.out.print("Genes: ");
          for (int i = 0; i < 5; i++) {
              System.out.print(demo.population.getFittest().genes[i]);
          }
      
          System.out.println("");
      
      }
      
      //Selection
      void selection() {
      
          //Select the most fittest individual
          fittest = population.getFittest();
      
          //Select the second most fittest individual
          secondFittest = population.getSecondFittest();
      }
      
      //Crossover
      void crossover() {
          Random rn = new Random();
      
          //Select a random crossover point
          int crossOverPoint = rn.nextInt(population.individuals[0].geneLength);
      
          //Swap values among parents
          for (int i = 0; i < crossOverPoint; i++) {
              int temp = fittest.genes[i];
              fittest.genes[i] = secondFittest.genes[i];
              secondFittest.genes[i] = temp;
      
          }
      
      }
      
      //Mutation
      void mutation() {
          Random rn = new Random();
      
          //Select a random mutation point
          int mutationPoint = rn.nextInt(population.individuals[0].geneLength);
      
          //Flip values at the mutation point
          if (fittest.genes[mutationPoint] == 0) {
              fittest.genes[mutationPoint] = 1;
          } else {
              fittest.genes[mutationPoint] = 0;
          }
      
          mutationPoint = rn.nextInt(population.individuals[0].geneLength);
      
          if (secondFittest.genes[mutationPoint] == 0) {
              secondFittest.genes[mutationPoint] = 1;
          } else {
              secondFittest.genes[mutationPoint] = 0;
          }
      }
      
      //Get fittest offspring
      Individual getFittestOffspring() {
          if (fittest.fitness > secondFittest.fitness) {
              return fittest;
          }
          return secondFittest;
      }
      
      
      //Replace least fittest individual from most fittest offspring
      void addFittestOffspring() {
      
          //Update fitness values of offspring
          fittest.calcFitness();
          secondFittest.calcFitness();
      
          //Get index of least fit individual
          int leastFittestIndex = population.getLeastFittestIndex();
      
          //Replace least fittest individual from most fittest offspring
          population.individuals[leastFittestIndex] = getFittestOffspring();
      }
      

      }

      //Individual class
      class Individual {

      int fitness = 0;
      int[] genes = new int[5];
      int geneLength = 5;
      
      public Individual() {
          Random rn = new Random();
      
          //Set genes randomly for each individual
          for (int i = 0; i < genes.length; i++) {
              genes[i] = Math.abs(rn.nextInt() % 2);
          }
      
          fitness = 0;
      }
      
      //Calculate fitness
      public void calcFitness() {
      
          fitness = 0;
          for (int i = 0; i < 5; i++) {
              if (genes[i] == 1) {
                  ++fitness;
              }
          }
      }
      

      }

      //Population class
      class Population {

      int popSize = 10;
      Individual[] individuals = new Individual[10];
      int fittest = 0;
      
      //Initialize population
      public void initializePopulation(int size) {
          for (int i = 0; i < individuals.length; i++) {
              individuals[i] = new Individual();
          }
      }
      
      //Get the fittest individual
      public Individual getFittest() {
          int maxFit = Integer.MIN_VALUE;
          int maxFitIndex = 0;
          for (int i = 0; i < individuals.length; i++) {
              if (maxFit <= individuals[i].fitness) {
                  maxFit = individuals[i].fitness;
                  maxFitIndex = i;
              }
          }
          fittest = individuals[maxFitIndex].fitness;
          return individuals[maxFitIndex];
      }
      
      //Get the second most fittest individual
      public Individual getSecondFittest() {
          int maxFit1 = 0;
          int maxFit2 = 0;
          for (int i = 0; i < individuals.length; i++) {
              if (individuals[i].fitness > individuals[maxFit1].fitness) {
                  maxFit2 = maxFit1;
                  maxFit1 = i;
              } else if (individuals[i].fitness > individuals[maxFit2].fitness) {
                  maxFit2 = i;
              }
          }
          return individuals[maxFit2];
      }
      
      //Get index of least fittest individual
      public int getLeastFittestIndex() {
          int minFitVal = Integer.MAX_VALUE;
          int minFitIndex = 0;
          for (int i = 0; i < individuals.length; i++) {
              if (minFitVal >= individuals[i].fitness) {
                  minFitVal = individuals[i].fitness;
                  minFitIndex = i;
              }
          }
          return minFitIndex;
      }
      
      //Calculate fitness of each individual
      public void calculateFitness() {
      
          for (int i = 0; i < individuals.length; i++) {
              individuals[i].calcFitness();
          }
          getFittest();
      }
      

      }
      ```

    1. there is no good solution to the object/relational mapping problem.

      We started out with a concept of relation that was not mathematical: conceptual associations with explicitly stated, meaningful names, and ways of traversing and interpreting the information in specific intended ways.

      Once you had associative complexes accessible, you could always provide alternative interpretations and in fact adjust the structure as needed, refactor, and create new associative complexes that treated info structures as a situated whole, together with the means of presenting, reasoning, processing, interpreting, etc.

      Before Codd's 'relational' model came along, information was processed and retrieved by traversing meaningful associations. We had NoSQL; that is to say, we did not just have constraints, but the processes corresponding to situational intent could be invoked on the fly, with the ability to check for constraints.

      Imposing constraints at write time was necessary so that memory/storage locations could be reused.

      Now, when data = information/code can be stored immutably and retrieved by content addresses and paths, we can have a true object model without trying to do the impossible and map the relational model into the object model. It is impossible because it imposes a separation and enclosure of information from the means of presenting and morphing it, where these two should co-evolve.

    1. Federal Regulation §602.17: Application of Standards in Reaching Accreditation Decisions requires that all public universities have processes in place through which the institution establishes that a student who registers in any course offered via distance education or correspondence is the same student who academically engages in the course or program; and makes clear in writing that institutions must use processes that protect student privacy and notify students of any projected additional student charges associated with the verification of student identity at the time of registration or enrollment. Please see the Electronic Code Federal Regulations for more information.

      Regulation about identity verification of students in online courses

    1. eLife assessment

      This important manuscript uses a machine-learning approach to predict and annotate cis-regulatory elements across insect genomes, helping to address a much-needed gap in comparative genomics. This method does not rely on sequence alignments, thereby allowing functional genomics studies of more distant species, including emerging model organisms. There are nuanced views on the strength of the evidence from the predictions: the pipeline appears to be based on solid evidence, but the methods could be better described. We suggest the manuscript would be much more robust if the code used was accessible for review and validated further.

    2. Reviewer #2 (Public Review):

      Summary:

      The ability of researchers to identify and compare enhancers across different species is an important facet of understanding gene regulation across development and evolution. Many traditional methods of enhancer identification involve sequence alignments and manual annotations, limiting the ability to expand the scope of regulatory investigations into many species. In order to overcome this obstacle, the authors apply a previously published machine learning method called SCRMshaw to predict enhancers across 33 insect species, using D. melanogaster as a reference. SCRMshaw operates through the selection of a few dozen training loci in a reference genome, marking genomic loci in other species that are significantly enriched with similar k-mer distributions relative to randomly selected genomic backgrounds. Upon identification of predicted enhancer regions, the authors perform post-processing step filtering and identify the most likely predicted enhancer candidates based on the proximity of an orthologous target gene. They then perform reporter gene analysis to validate selected predicted enhancers from other species in D. melanogaster. The analysis of the expression patterns returned variable results across the selected predicted regions.
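      As a rough illustration of the k-mer idea described above, the sketch below scores a candidate window by the log-odds of its k-mers under training versus background frequencies. This is not SCRMshaw's actual scoring scheme, which the summary does not spell out; the function names, the pseudocount, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def log_odds_score(window, train, background, k=3, pseudo=1.0):
    """Sum of log(P_train / P_background) over the k-mers in `window`.

    Hypothetical scoring rule for illustration only; the pseudocount keeps
    unseen k-mers from producing log(0).
    """
    tc, bc = kmer_counts(train, k), kmer_counts(background, k)
    n_kmers = 4 ** k  # number of possible DNA k-mers
    t_total = sum(tc.values()) + pseudo * n_kmers
    b_total = sum(bc.values()) + pseudo * n_kmers
    score = 0.0
    for i in range(len(window) - k + 1):
        kmer = window[i:i + k]
        p_train = (tc[kmer] + pseudo) / t_total
        p_background = (bc[kmer] + pseudo) / b_total
        score += math.log(p_train / p_background)
    return score


# Toy data (made up): training loci rich in GATAA-like words.
train = ["GATAAGATAAGATAA", "TTGATAAGGATAAC"]
background = ["ACGTACGTACGTACGT", "CCCCGGGGTTTTAAAA"]
print(log_odds_score("AGATAAG", train, background))  # resembles training loci
print(log_odds_score("CGTACGT", train, background))  # resembles background
```

      A window that shares many k-mers with the training loci scores above zero, and one that looks like background scores below zero. A real pipeline would also need a threshold or an empirical null distribution to decide which scores count as significant.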

      Strengths:

      The authors provide annotations of predicted regions across dozens of insect species, with the intention of expanding and refining the annotations for use by the scientific field. This is useful, as researchers will be able to use the identified annotations for their own work or as a benchmark for future methods. This work also showcases the flexible and versatile nature of SCRMshaw, which can readily obtain predictions using training sets of genomic loci requiring only a few dozen annotations as input. SCRMshaw does not require sequence alignments of the enhancers and can operate without prior knowledge of the cis-regulatory sequence rules such as transcription factor binding motifs, making it a useful tool to explore the evolution of enhancers in further distant and less well-studied species.

      Weaknesses:

      This work provides predicted enhancer annotations across many insect species, with reporter gene analysis being conducted on selected regions to test the predictions. However, the code for the SCRMshaw analysis pipeline used in this work is not made available, making reproducibility of this work difficult. Additionally, while the authors claim the predicted enhancers are available within the REDfly database, the predicted enhancer coordinates are currently not downloadable as Supplementary Material or from a linked resource.

      The authors do not validate or benchmark the application of SCRMshaw against other published methods, nor do they seek to apply SCRMshaw under a variety of conditions to confirm the robustness of the returned predicted enhancers across species. Since SCRMshaw relies on an established k-mer enrichment of the training loci, its performance is presumably highly sensitive to the selection of training regions as well as the statistical power of the given k-mer counts. The authors do not justify their selection of training regions by which they perform predictions.

      While there is an attempt made to report and validate the annotated predicted enhancers using previously published data and tools, the validation lacks the depth to conclude with confidence that the predicted set of regions across each species is of high quality. In vivo, reporter assays were conducted to anecdotally confirm the validity of a few selected regions experimentally, but even these results are difficult to interpret. There is no large-scale attempt to assess the conservation of enhancer function across all annotated species.

      Lastly, it is suggested that predicted regions are derived from the shared presence of sequence features such as transcription factor binding motifs, detected through k-mer enrichment via SCRMshaw. This assumption has not been examined, although there are public motif discovery tools that would be appropriate to discover whether SCRMshaw is assigning predicted regions based on previously understood motif grammar, or due to other sequence patterns captured by k-mer count distributions. Understanding the sequence-derived nature of what drives predictions is within the scope of this work and would boost confidence in the predicted enhancers, even if it is limited to a few training examples for the sake of clarity of interpretation.

    3. Author response:

      We thank the reviewers for their thoughtful and insightful comments. We were pleased to see that the reviewers and editors consider our work a “welcome addition” that “fills a large gap” in comparative genomics methods and provides “an unparalleled community resource of insect genome regulatory annotations.”

      Many of the reviewers’ comments reflect weaknesses in our description of the methodology. As the basic SCRMshaw methodology has been published previously, we had opted for brevity over detail in the current manuscript. We recognize now that we went too far in that direction, and we will include more methodological detail in our revised submission, along with easier access to the code we used. The reviewers also offered some helpful suggestions regarding data availability which we intend to address, including direct download of the results in GFF format and adding to the results database several species that were inadvertently omitted.

      Reviewer 2 expressed concerns about benchmarking SCRMshaw against other methods. We respectfully feel this lies outside the scope of the current study, which focuses on application of SCRMshaw to generate a multi-species annotation resource rather than on an attempt to show that SCRMshaw is superior to other approaches. We provide evidence in this manuscript, as well as in previous publications, that supports the effectiveness of SCRMshaw as an approach for regulatory element discovery that is suitable for the task at hand. Benchmarking for regulatory element discovery brings many challenges, as there are no comprehensive “truth” sets to serve as a comparison baseline. We therefore do not attempt strong claims here about the relative merits of SCRMshaw vs. other methods (although we have explored this in previous publications). Note that we also previously demonstrated commonality of transcription factor binding sites in cross-species SCRMshaw predictions, in particular in Kazemian et al. 2014 (Genome Biol. Evol. 6:2301).

      Finally, because it has important implications for understanding our results, we would like to point out a small misconception in Reviewer 2’s Summary of our study. The reviewer states that we “identify the most likely predicted enhancer candidates based on the proximity of an orthologous target gene.” We stress, however, that putative target gene assignments and identities have no impact at all on our prediction of regulatory sequences. Predictions are solely based on sequence-dependent SCRMshaw scores, with no regard to the nature or identities of nearby annotated features. Putative target genes are mapped to Drosophila orthologs purely as a convenience to aid in interpreting and prioritizing the predicted regulatory elements. We will take care to clarify this important point in our revised submission.

    1. Reviewer #1 (Public Review):

      Summary:

      The study introduces and validates the Cyclic Homogeneous Oscillation (CHO) detection method to precisely determine the duration, location, and fundamental frequency of non-sinusoidal neural oscillations. Traditional spectral analysis methods face challenges in distinguishing the fundamental frequency of non-sinusoidal oscillations from their harmonics, leading to potential inaccuracies. The authors implement an underexplored approach, using the auto-correlation structure to identify the characteristic frequency of an oscillation. By combining this strategy with existing time-frequency tools to identify when oscillations occur, the authors strive to solve outstanding challenges involving spurious harmonic peaks detected in time-frequency representations. Empirical tests using electrocorticographic (ECoG) and electroencephalographic (EEG) signals further support the efficacy of CHO in detecting neural oscillations.
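      To make the auto-correlation idea concrete, the following sketch builds a non-sinusoidal 10 Hz wave that also contains a 20 Hz harmonic and recovers the fundamental from the highest auto-correlation peak after the lag-0 lobe. It is not the CHO implementation; the signal, the sampling rate, and the helper names are assumptions for illustration.

```python
import math


def autocorr(x, max_lag):
    """Normalized auto-correlation of x for lags 0..max_lag (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d)
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) / var
            for lag in range(max_lag + 1)]


def fundamental_hz(x, fs, max_lag):
    """Estimate the fundamental as fs / (lag of the highest non-zero-lag peak).

    The search starts after the first negative value so the lag-0 lobe
    is skipped.
    """
    ac = autocorr(x, max_lag)
    start = next(i for i, v in enumerate(ac) if v < 0)
    best = max(range(start, max_lag + 1), key=lambda i: ac[i])
    return fs / best


fs = 1000  # assumed sampling rate in Hz
t = [i / fs for i in range(2000)]
# 10 Hz fundamental plus a strong 20 Hz harmonic gives a non-sinusoidal waveform
signal = [math.sin(2 * math.pi * 10 * s)
          + 0.5 * math.sin(2 * math.pi * 20 * s + 0.5) for s in t]
print(fundamental_hz(signal, fs, max_lag=150))
```

      Because both the fundamental and its harmonic repeat every 100 ms, the auto-correlation has its strongest non-zero-lag peak there, so the harmonic does not register as a separate oscillation.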

      Strengths:

      The paper puts important emphasis on the 'identity' question of oscillatory identification. The field primarily identifies oscillations through frequency, space (brain region), and time (length, and relative to task or rest). However, more tools that claim to further characterize oscillations by their defining/identifying traits are needed, in addition to data-driven studies about what the identifiable traits of neural oscillations are beyond frequency, location, and time. Such tools are useful for potentially distinguishing between circuit mechanistic generators underlying signals that may not otherwise be distinguished. This paper states this problem well and puts forth a new type of objective for neural signal processing methods.

      The paper uses synthetic data and multimodal recordings at multiple scales to validate the tool, suggesting CHO's robustness and applicability in various real-data scenarios. The figures illustratively demonstrate how CHO works on such synthetic and real examples, depicting in both time and frequency domains. The synthetic data are well-designed, and capable of producing transient oscillatory bursts with non-sinusoidal characteristics within 1/f noise. Using both non-invasive and invasive signals exposes CHO to conditions which may differ in the extent and quality of harmonic signal structure. An interesting follow-up question is whether the utility demonstrated here holds for MEG signals, as well as source-reconstructed signals from non-invasive recordings.

      This study is accompanied by open-source code and data for use by the community.

      Weaknesses:

      The criteria that the authors use for neural oscillations embody some operating assumptions underlying their characteristics, perhaps informed by immediate use cases intended by the authors (e.g., hippocampal bursts). The extent to which these assumptions hold in all circumstances should be investigated. For instance, the notion of consistent auto-correlation breaks down in scenarios where instantaneous frequency fluctuates significantly at the scale of a few cycles. Imagine an alpha-beta complex without harmonics (Jones 2009). If oscillations change phase position within a timeframe of a few cycles, it would be difficult for a single peak in the auto-correlation structure to elucidate the complex time-varying peak frequency in a dynamic fashion. Likewise, it is unclear whether bounding boxes with a pre-specified overlap can capture complexes that manoeuvre across peak frequencies.

      This method appears to lack the implementation of statistical inferential techniques for estimating and interpreting auto-correlation and spectral structure. In standard practice, auto-correlation functions and spectral measures can be subjected to statistical inference to establish confidence intervals, often helping to determine the significance of the estimates. Doing so would be useful for expressing the likelihood that an oscillation and its harmonic has the same auto-correlation structure and fundamental frequency, or more robustly identifying harmonic peaks in the presence of spectral noise. Here, the authors appear to use auto-correlation and time-frequency decomposition more as a deterministic tool rather than an inferential one. Overall, an inferential approach would help differentiate between true effects and those that might spuriously occur due to the nature of the data. Ultimately, a more statistically principled approach might estimate harmonic structure in the presence of noise in a unified manner transmitted throughout the methodological steps.
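      One simple form such inference could take is sketched below. It illustrates the reviewer's suggestion and is not something the paper is described as doing. Under a white-noise null, sample auto-correlations at non-zero lags are approximately normal with standard deviation 1/sqrt(N), so lags beyond ±1.96/sqrt(N) can be flagged at roughly the 5% level. The helper names and the simulated signals are assumptions.

```python
import math
import random


def autocorr(x, max_lag):
    """Normalized auto-correlation of x for lags 0..max_lag (mean removed)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d)
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) / var
            for lag in range(max_lag + 1)]


def significant_lags(x, max_lag, z=1.96):
    """Lags whose auto-correlation exceeds the approximate white-noise bound."""
    bound = z / math.sqrt(len(x))
    ac = autocorr(x, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(ac[lag]) > bound]


random.seed(0)
noise = [random.gauss(0, 1) for _ in range(1000)]
# Same noise with a 10 Hz rhythm added (sampling rate assumed to be 1000 Hz)
rhythm = [math.sin(2 * math.pi * 10 * i / 1000) + e for i, e in enumerate(noise)]

print(len(significant_lags(noise, 100)))
print(len(significant_lags(rhythm, 100)))
```

      Pure noise should cross the bound at only a handful of lags by chance, while the rhythmic signal crosses it at most lags. A fuller treatment would correct for testing many lags and would use a coloured (1/f) null rather than white noise.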

    1. any act that constitutes violent behavior and any other behavior that adversely affects the College or its educational programs or mission.  Attempts to commit acts prohibited by the Code may also be addressed through the conduct process. All members of the College community, students, faculty and staff, have the responsibility to report nonacademic misconduct.

      Everyone is a reporter

    1. Violation of Student Code of Conduct Report
       Student's Name: ____________________
       Student Identification Number: ____________________
       Instructor’s Name: ____________________ Office Phone #: ____________________
       Instructor’s E-mail Address: ____________________
       Course Title: ____________________
       Course Number: ____________________ Section Number: ________
       Description of Incident (use additional pages if necessary): ____________________
       Describe the instructions that were given to the student: ____________________
       Was the student asked to leave the class? Yes _____ No _____ N/A _____
       Did the student leave voluntarily? Yes _____ No _____
       Were the police contacted? Yes ____ No ____
       If yes, officer’s name: ____________________ Officer’s Department: ____________________
       Action taken by Police (list report number and whether arrest occurred): ____________________
       Faculty Member’s Signature: ____________________ Date: ____________
       Submit copy of form electronically to: student, department chair, and to Student Judicial Programs (who will share with Student Development Office) at tp-sheridan@wiu.edu or via fax to 309-298-1203

      form sample

    1. The Central Election Commission, Territorial and Precinct Election Commissions have the right to prepare a protocol on administrative irregularities based on the Code of the Azerbaijan Republic On Administrative Irregularities, for the irregularities done by candidates, registered candidates, authorised representatives of political parties, and blocks of political parties

      ???

    1. There's only one placeholder remaining in our simple front end application, for the data from the backing service. Let's finish that off now by applying (a copy of) the sfe deployment in both namespaces. Again, you might wish to change the lab3frontends to lab4frontends or simply frontends.

      You can use the files in the starters or solutions folder if you don't have the files from previous labs (from the Explorer view in the VS Code connection).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Protein conformational changes are often critical to protein function, but obtaining structural information about conformational ensembles is a challenge. Over a number of years, the authors of the current manuscript have developed and improved an algorithm, qFit protein, that models multiple conformations into high resolution electron density maps in an automated way. The current manuscript describes the latest improvements to the program, and analyzes the performance of qFit protein in a number of test cases, including classical statistical metrics of data fit like Rfree and the gap between Rwork and Rfree, model geometry, and global and case-by-case assessment of qFit performance at different data resolution cutoffs. The authors have also updated qFit to handle cryo-EM datasets, although the analysis of its performance is more limited due to a limited number of high-resolution test cases and less standardization of deposited/processed data.

      Strengths:

      The strengths of the manuscript are the careful and extensive analysis of qFit's performance over a variety of metrics and a diversity of test cases, as well as the careful discussion of the limitations of qFit. This manuscript also serves as a very useful guide for users in evaluating if and when qFit should be applied during structural refinement.

      Reviewer #2 (Public Review):

      Summary

      The manuscript by Wankowicz et al. describes updates to qFit, an algorithm for the characterization of conformational heterogeneity of protein molecules based on X-ray diffraction or cryo-EM data. The work provides a clear description of the algorithm used by qFit. The authors then proceed to validate the performance of qFit by comparing it to deposited X-ray entries in the PDB in the 1.2-1.5 Å resolution range as quantified by Rfree, Rwork-Rfree, detailed examination of the conformations introduced by qFit, and performance on stereochemical measures (MolProbity scores). To examine the effect of experimental resolution of X-ray diffraction data, they start from an ultra high-resolution structure (SARS-CoV2 Nsp3 macrodomain) to determine how the loss of resolution (introduced artificially) degrades the ability of qFit to correctly infer the nature and presence of alternate conformations. The authors observe a gradual loss of ability to correctly infer alternate conformations as resolution degrades past 2 Å. The authors repeat this analysis for a larger set of entries in a more automated fashion and again observe that qFit works well for structures with resolutions better than 2 Å, with a rapid loss of accuracy at lower resolution. Finally, the authors examine the performance of qFit on cryo-EM data. Despite a few prominent examples, the authors find only a handful (8) of datasets for which they can confirm a resolution better than 2.0 Å. The performance of qFit on these maps is encouraging and will be of much interest because cryo-EM maps will, presumably, continue to improve and because of the rapid increase in the availability of such data for many supramolecular biological assemblies. As the authors note, practices in cryo-EM analysis are far from uniform, hampering the development and assessment of tools like qFit.

      Strengths

      qFit improves the quality of refined structures at resolutions better than 2.0 Å, in terms of reflecting true conformational heterogeneity and geometry. The algorithm is well designed and does not introduce spurious or unnecessary conformational heterogeneity. I was able to install and run the program without a problem within a computing cluster environment. The paper is well written and the validation thorough.

      I found the section on cryo-EM particularly enlightening, both because it demonstrates the potential for discovery of conformational heterogeneity from such data by qFit, and because it clearly explains the hurdles towards this becoming common practice, including lack of uniformity in reporting resolution, and differences in map and solvent treatment.

      Weaknesses

      The authors begin the results section by claiming that they made "substantial improvement" relative to the previous iteration of qFit, "both algorithmically (e.g., scoring is improved by BIC, sampling of B factors is now included) and computationally (improving the efficiency and reliability of the code)" (bottom of page 3). However, the paper does not provide a comparison to previous iterations of the software or quantitation of the effects of these specific improvements, such as whether scoring is improved by the BIC, how the application of BIC has changed since the previous paper, whether sampling of B factors helps, and whether the code is faster. It would help the reader to understand what, if any, the significance of each of these improvements was.

      Indeed, it is difficult (embarrassingly) to benchmark against our past work due to the dependencies on different python packages and the lack of software engineering. With the infrastructure we’ve laid down with this paper, made possible by an EOSS grant from CZI, that will not be a problem going forward. Not only is the code more reliable and standardized, but we have developed several scientific test sets that can be used as a basis for broad comparisons to judge whether improvements are substantial. We’ve also changed with “substantial improvement” to “several modifications”  to indicate the lack of comparison to past versions.

      The exclusion of structures containing ligands and multichain protein models in the validation of qFit was puzzling since both are very common in the PDB. This may convey the impression that qFit cannot handle such use cases. (Although it seems that qFit has an algorithm dedicated to modeling ligand heterogeneity and seems to be able to handle multiple chains). The paper would be more effective if it explained how a user of the software would handle scenarios with ligands and multiple chains, and why these would be excluded from analysis here.

      qFit can indeed handle both. We left out multiple chains for simplicity: a dataset enriched for small proteins that still covers structural diversity let us iterate on and test our approaches rapidly. Improvements to qFit ligand handling will be discussed in a forthcoming work, as we face similar technical debt to what we saw in proteins and are undergoing a process of introducing “several modifications” that we hope will lead to “substantial improvement” - but at the very least will accelerate further development.

      It would be helpful to add some guidance on how/whether qFit models can be further refined afterwards in Coot, Phenix, ..., or whether these models are strictly intended as the terminal step in refinement.

      We added to the abstract:

      “Importantly, unlike ensemble models, the multiconformer models produced by qFit can be manually modified in most major model building software (e.g. Coot) and fit can be further improved by refinement using standard pipelines (e.g. Phenix, Refmac, Buster).”

      and introduction:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot [12] unlike ensemble methods that generate multiple complete protein copies (Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      and results:

      “This model can then be examined and edited in Coot [12] or other visualization software, and further refined using software such as phenix.refine, refmac, or buster as the modeler sees fit.”

      and discussion:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Appraisal & Discussion

      Overall, the authors convincingly demonstrate that qFit provides a reliable means to detect and model conformational heterogeneity within high-resolution X-ray diffraction datasets and (based on a smaller sample) in cryo-EM density maps. This represents the state of the art in the field and will be of interest to any structural biologist or biochemist seeking to attain an understanding of the structural basis of the function of their system of interest, including potential allosteric mechanisms, an area where there are still few good solutions. That is, I expect qFit to find widespread use.

      Reviewer #3 (Public Review):

      Summary:

      The authors address a very important issue of going beyond a single-copy model obtained by the two principal experimental methods of structural biology, macromolecular crystallography and cryo electron microscopy (cryo-EM). Such a multiconformer model is based on the fact that experimental data from both these methods represent a space- and time-average of a huge number of the molecules in a sample, or even in several samples, and that the respective distributions can be multimodal. Unlike structure prediction methods, this approach is strongly based on high-resolution experimental information and requires validated single-copy high-quality models as input. Overall, the results support the authors' conclusions.

      In fact, the method addresses two problems which could be considered separately:

      - An automation of construction of multiple conformations when they can be identified visually;

      - A determination of multiple conformations when their visual identification is difficult or impossible.

      We often think about this problem similarly to the reviewer. However, in building qFit, we do not want to separate these problems - but rather use the first category (obvious visual identification) to build an approach that can accomplish part of the second category (difficult to visualize) without building “impossible”/nonexistent conformations - with a consistent approach/bias.

      The first one is a known problem, when missing alternative conformations may cost a few percent in R-factors. While these conformations are relatively easy to detect and build manually, the current procedure may save significant time being quite efficient, as the test results show.

      We agree with the reviewers' assessment here. The “floor” in terms of impact is automating a tedious part of high resolution model building and improving model quality.

      The second problem is important from the physical point of view and was first addressed by Burling & Brunger (1994; https://doi.org/10.1002/ijch.199400022). The new procedure deals with a second-order variation in the R-factors, of about 1% or less, comparable to the effect of placing riding hydrogen atoms, modeling density deformation, or varying the bulk solvent. In such situations, it is hard to justify model improvement. Unchanged or marginally decreased Rfree values can be considered a sign that the model does not overfit the data, but hardly a strong argument in favor of the model.

      We agree with the overall sentiment of this comment. What is a significant variation in R-free is an important question that we have looked at previously (http://dx.doi.org/10.1101/448795) and others have suggested an R-sleep for further cross validation (https://pubmed.ncbi.nlm.nih.gov/17704561/). For these reasons it is important to get at the significance of the changes to model types from large and diverse test sets, as we have here and in other works, and from careful examination of the biological significance of alternative conformations with experiments designed to test their importance in mechanism.

      In general, overall targets are less appropriate for this kind of problem and local characteristics may be better indicators. Improvement of the model geometry is a good choice. Indeed, Cruickshank (1956; https://doi.org/10.1107/S0365110X56002059) already showed that averaged density images may lead to a shortening of covalent bonds when such maps are interpreted by a single model. However, a total absence of geometric outliers is not necessarily required for structures solved at high resolution, where diffraction data should have more freedom to place the atoms where the experiments "see" them.

      Again, we agree—geometric outliers should not be completely absent, but it is comforting when they and model/experiment agreement both improve.

      The key local characteristic for multiconformer models is the closeness of the model map to the experimental one. In fact, the procedure uses a measure of this kind, the Bayesian information criterion (BIC). Unfortunately, there is no information about how sharply it identifies the best model or how much it changes between the initial and final models; overall, the reader gets no feeling for its values. The Q-score (page 17) can be a tool for the first problem, where the multiple conformations are clearly separated, but not for the second problem, where the contributions from neighboring conformations are merged. In addition to BIC, or to even more conventional target functions such as LS or local map correlation, the extreme and mean values of the local difference maps may help to validate the models.

      We agree with the reviewer that the problem of “best” model determination is poorly posed here. We have been thinking a lot about this in the context of Bayesian methods (see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9278553/); however, a major stumbling block is in how variable representations of alternative conformations (and compositions) are handled. The answers are more (but by no means simply) straightforward for ensemble representations where the entire system is constantly represented but with multiple copies.
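      For readers who want a feeling for how BIC values behave, the general textbook definition can be sketched as follows. This is only the standard formula, BIC = k ln(n) - 2 ln(L); the function name and the example numbers are illustrative assumptions and do not describe how qFit computes or applies its score.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Textbook Bayesian information criterion: k*ln(n) - 2*ln(L).

    Lower values are preferred. This is the generic definition only,
    not qFit's implementation (which is not shown in this response).
    """
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical numbers: the same fit quality (log-likelihood) reached with
# more parameters is penalized, so the simpler model has the lower BIC.
simple_model = bic(log_likelihood=-50.0, n_params=2, n_obs=100)
complex_model = bic(log_likelihood=-50.0, n_params=5, n_obs=100)
print(simple_model < complex_model)
```

      The penalty term grows with the number of parameters, which is why a criterion of this kind can discourage adding conformers that do not improve the fit enough.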

      This method and its results are a strong argument for the need for experimental data and the information they contain, as opposed to pure structure prediction. At the same time, the absence of strong density-based proofs may limit its impact.

      We agree - indeed we think it will be difficult to further improve structure prediction methods without much more interaction with the experimental data.

      Strengths:

      Addressing an important problem and automatization of model construction for alternative conformations using high-resolution experimental data.

      Weaknesses:

      An insufficient validation of the models when no discrete alternative conformations are visible and essentially missing local real-space validation indicators.

      Although these are not perfect real-space indicators, local real-space validation is implicit in the MIQP selection step and explicit when we employ Q-score metrics.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      A point of clarification: I don't understand why waters seem to be handled differently for cryo-EM and crystallography datasets. I am curious about the statement on page 19 that the MolProbity Clashscore gets worse for cryo-EM datasets, primarily due to clashes with waters. But the qFit algorithm includes a round of refinement to optimize placement of ordered waters, and the clashscore improves for the qFit refinement in crystallography test cases. Why/how is this different for cryo-EM?

      We agree that this was not an appropriate point. We believe that the high clash score is coming from side chains being incorrectly modeled. We have updated this in the manuscript and it will be a focus of future improvements.

      Reviewer #2 (Recommendations For The Authors):

      - It would be instructive to the reader to explain how qFit handles the chromophore in the PYP (1OTA) example. To this end, it would be helpful to include deposition of the multiconformer model of PYP. This might also be a suitable occasion for discussion of potential hurdles in the deposition of multiconformer models in the PDB (if any!). Such concerns may be real concerns causing hesitation among potential users.

      Thank you for this comment. qFit does not alter the position or connectivity of any HETATM records (like the chromophore in this structure). Handling covalent modifications like this is an area of future development.

      Regarding deposition, we have noted above that the discussion now includes:

      “qFit is compatible with manual modification and further refinement as long as the subsequent software uses the PDB standard altloc column, as is common in most popular modeling and refinement programs. The models can therefore generally also be deposited in the PDB using the standard deposition and validation process.”

      Finally, we have placed all PDBs in a Zenodo deposition (XXX) and have included that language in the manuscript. It is currently under a separate data availability section (page XXX). We will defer to the editor as to the best header for it to go under.

      - It may be advisable to take the description of true/false pos/negatives out of the caption of Figure 4, and include it in a box or so, since these terms are important in the main text too, and the caption becomes very cluttered.

      We think adding the description of true/false pos/negatives to the Figure panel would make it very cluttered and wordy. We would like to retain this description within the caption. We have also briefly described each in the main text.

      - page 21, line 4: some issue with citation formatting.

      We have updated these citations.

      - page 25, second paragraph: cardinality is the number of members of a set. Perhaps "minimal occupancy" is more appropriate.

      Thank you for pointing this out. This was a mistake and should have been called the occupancy threshold.

      - page 26: it's - its

      Thank you, we have made this change. 

      - Font sizes in Supplementary Figures 5-7 are too small to be readable.

      We agree and will make this change. 

      Reviewer #3 (Recommendations For The Authors):

      General remarks

      (1) As I understand, the procedure starts from shifting residues one by one (page 4; A.1). Then, geometry reconstruction (e.g., B1) may be difficult in some cases joining back the shifted residues. It seems that such backbone perturbation can be done more efficiently by shifting groups of residues ("potential coupled motions") as mentioned at the bottom of page 9. Did I miss its description?

      We would describe the algorithm as sampling (which includes minimal shifts) in the backbone residues to ensure we can link neighboring residues. We agree that future iterations of qFit should include more effective backbone sampling by exploring motion along the Cβ-Cα, C-N, and (Cβ-Cα × C-N) bonds and exploring correlated backbone movements.

      (2) While the paper is well split into clear parts, some of them do not seem to be in their optimal place and could be moved to "Methods" (the detailed "Overview of the qFit protein algorithm" as a whole) or to a "Data" section, which is currently missing (the first two paragraphs of "qFit improves overall fit...", page 8; "Generating the qFit test set", page 22; and "Generating synthetic data ..." on page 26; that is, the description of the test data set). To my personal taste, the description of tests with simulated data (page 15) would be better placed before that of tests with real data.

      Thank you for this comment, but we stand by our original decision to keep the general flow of the paper as it was submitted.

      (3) I wonder if the term "quadratic programming" (e.g., A3, page 5) is appropriate. It supposes optimization of a quadratic function of the independent parameters and not of "some" parameters. This is like the crystallographic LS which is not a quadratic function of atomic coordinates, and I think this is a similar case here. Whatever the answer on this remark is, an example of the function and its parameters is certainly missed.

      We think that the term quadratic programming is appropriate. We fit the coefficients by minimizing a quadratic loss on the difference between observed and calculated density, subject to constraints on those coefficients. We agree that the quadratic function is missing from the paper, and we have now included it in the Methods section.

      Technical remarks to be answered by the authors:
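      To make the reviewer's distinction concrete, a generic problem of this kind can be written as below. This is a sketch under stated assumptions, not the function the authors added to their Methods: the symbols (weights w_i, calculated conformer densities, observed density) and the two constraints are illustrative. The point it shows is that the objective is quadratic in the coefficients w, even though the densities themselves are not quadratic functions of atomic coordinates.

```latex
\[
\begin{aligned}
\min_{w \in \mathbb{R}^{n}} \quad
  & \Bigl\| \rho^{\mathrm{obs}} - \sum_{i=1}^{n} w_i \, \rho^{\mathrm{calc}}_{i} \Bigr\|_2^2
    \;=\; w^{\top} Q \, w \;-\; 2\, q^{\top} w \;+\; \bigl\| \rho^{\mathrm{obs}} \bigr\|_2^2 , \\
  & Q_{ij} = \bigl\langle \rho^{\mathrm{calc}}_{i}, \rho^{\mathrm{calc}}_{j} \bigr\rangle , \qquad
    q_{i} = \bigl\langle \rho^{\mathrm{calc}}_{i}, \rho^{\mathrm{obs}} \bigr\rangle , \\
\text{subject to} \quad
  & w_i \ge 0 \;\; (i = 1, \dots, n), \qquad \sum_{i=1}^{n} w_i \le 1 .
\end{aligned}
\]
```

      Because the candidate conformations are fixed before the fit, only the weights are free parameters, which is what keeps a problem of this form within quadratic programming.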

      (1) Page 1, Abstract, line 3. The ensemble modeling is not the only existing frontier, and saying "one of the frontiers" may be better. Also, this phrase gives a confusing impression that the authors aim to predict the ensemble models while they do it with experimental data.

      We agree with this statement and have re-worded the abstract to reflect this.

      (2) Page 2. Burling & Brunger (1994) should be cited as predecessors. On the contrary, an excellent paper by Pearce & Gros (2021) is not relevant here.

      We agree that we should mention the Burling & Brunger paper. However, we think that the Pearce & Gros (2021) paper should not be removed, as it is not discussing the method of ensemble refinement.

      (3) Page 2, bottom. "Further, when compared to ..." The preference to such approach sounds too much affirmative.

      We have amended this sentence to state:

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot (Emsley et al. 2010) unlike ensemble methods that generate multiple complete protein copies (Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      “The point we were trying to make in this sentence was that ensemble-based models are much harder to manually manipulate in Coot or other similar software compared to multiconformer models. We think that the new version of this sentence states this point more clearly.”

      (4) Page 2, last paragraph. I do not see an obvious relation of references 15-17 to the phrase they are associated with.

      We disagree with this statement, and think that these references are appropriate.

      “Multiconformer models are notably easier to modify and more interpretable in software like Coot [12] unlike ensemble methods that generate multiple complete protein copies (Burnley et al. 2012; Ploscariu et al. 2021; Temple Burling and Brünger 1994).”

      (5) Page 3, paragraph 2. Cryo-EM maps should be also "high-resolution"; it does not read like this from the phrase.

      We agree that high-resolution should be added, and the sentence now states:

      “However, many factors make manually creating multiconformer models difficult and time-consuming. Interpreting weak density is complicated by noise arising from many sources, including crystal imperfections, radiation damage, and poor modeling in X-ray crystallography, and errors in particle alignment and classification, poor modeling of beam induced motion, and imperfect Detector Quantum Efficiency (DQE) in high-resolution cryo-EM.”

      (6) Page 3, last paragraph before "results". The words "... in both individual cases and large structural bioinformatic projects" do not have much meaning, except introducing a self-reference. Also, repeating "better than 2 A" looks not necessary.

      We agree that this was unnecessary and have simplified the last sentence to state:

      “With the improvements in model quality outlined here, qFit can now be increasingly used for finalizing high-resolution models to derive ensemble-function insights.”

      (7) Page 3. "Results". Could "experimental" be replaced by a synonym, like "trial", to avoid confusing with the meaning "using experimental data"?

      We have replaced “experimental” with “exploratory” to describe the use of qFit on cryo-EM data. The statement now reads:

      “For cryo-EM modeling applications, equivalent metrics of map and model quality are still developing, rendering the use of qFit for cryo-EM more exploratory.”

      (8) Page 4, A.1. Should it be "steps +/- 0.1" and "coordinate" be "coordinate axis"? One can modify coordinates and not shift them. I do not understand how, with the given steps, the authors calculated the number of combinations ("from 9 to 81"). Could a long "Alternatively, ...absent" be reduced simply to "Otherwise"?

      We have simplified and clarified the sentence on the sampling of backbone coordinates to state:

      “If anisotropic B-factors are absent, the translation of coordinates occurs in the X, Y, and Z directions. Each translation takes place in steps of 0.1 Å along each coordinate axis, extending to 0.3 Å, resulting in 9 (if isotropic) or 81 (if anisotropic) distinct backbone conformations for further analysis.”

      (9) Page 6, B.1, line 2. Word "linearly" is meaningless here.

      We have modified this to read:

      “Moving from N- to C- terminus along the protein,”

      (10) Page 9, line 2. It should be explained which data set is considered as the test set to calculate Rfree.

      We think this is clear and would be repetitive if we duplicated it.

      (11) Page 9, line 7. It should be "a valuable metric" and not "an"

      We agree and have updated the sentence to read:

      “Rfree is a valuable metric for monitoring overfitting, which is an important concern when increasing model parameters as is done in multiconformer modeling.”

      (12) Page 10, paragraph 3. "... as a string (Methods)". I did not find any other mention of this term "string", including in "Methods" where it supposed to be explained. Either this should be explained (and an example is given?), or be avoided.

      We agree that string is not necessary (discussing the programmatic datatype). We have removed this from the sentence. It now reads:

      “To quantify how often qFit models new rotameric states, we analyzed the qFit models with phenix.rotalyze, which outputs the rotamer state for each conformer (Methods).”

      (13) Page 10, lines 3-4 from bottom. Are these two alternative conformations justified?

      We are unsure what this is referring to.

      (14) Page 12, Fig. 2A. In comparison with Supplement Fig 2C, the direction of axes is changed. Could they be similar in both Figures?

      We have updated Supplementary Figure 2C to have the same direction of axes as Figure 2A.

      (15) Page 15, section's title. Choose a single verb in "demonstrate indicate".

      We have amended the title of this section to be:

      “Simulated data demonstrate qFit is appropriate for high-resolution data.”

      (16) Page 15, paragraph 2. "Structure factors from 0.8 to 3.0 A resolution" does not mean what the author wanted apparently to tell: "(complete?) data sets with the high-resolution limit which varied from 0.8 to 3.0 A ...". Also, a phrase of "random noise increasing" is not illustrated by Figs.5 as it is referred to.

      We have edited this sentence to now read:

      “To create the dataset for resolution dependence, we used the ground truth 7KR0 model, including all alternative conformations, and generated artificial structure factors with a high resolution limit ranging from 0.8 to 3.0 Å resolution (in increments of 0.1 Å).”

      (17) Page 15, last paragraph is written in a rather formal and confusing way while a clearer description is given in the figure legend and repeated once more in Methods. I would suggest to remove this paragraph.

      We agree that this is confusing. Instead of creating a true positive/false positive/true negative/false negative matrix, we have just called things as they are: multiconformer or single conformer, and match or no match. We have edited the language in the manuscript and figure legends to reflect these changes.

      (18) Page 16. Last two paragraphs start talking about a new story and it would help to separate them somehow from the previous ones (sub-title?).

      We agree that this could use a subtitle. We have included the following subtitle above this section:

      “Simulated multiconformer data illustrate the convergence of qFit.”

      (19) Page 20. "or static" and "we determined that" seem to be not necessary.

      We have removed “static” and now use only “single conformer models”. However, as one of the main conclusions of this paper is that qFit can pick up on alternative conformers that were modeled manually, we have decided to keep “we determined that”.

      (20) Page 21, first paragraph. "Data" are plural; it should be "show" and "require"

      We have made these edits. The sentence now reads:

      “However, our data here show that not only does qFit need a high-resolution map to be able to detect signal from noise, it also requires a very well-modeled structure as input.”

      (21) Page 21, References should be indicated as [41-45], [35,46-48], [55-57]. A similar remark to [58-63] at page 22.

      We have fixed the reference layout to reflect this change.

      (22) Page 21, last paragraph. "Further reduce R-factors" (moreover repeated twice) is not correct neither by "further", since here it is rather marginal, nor as a goal; the variations of R-factors are not much significant. A more general statement like "improving fit to experimental data" (keeping in mind density maps) may be safer.

      We agree with the duplicative nature of these statements. We have amended the sentence to now read:

      “Automated detection and refinement of partial-occupancy waters should help improve fit to experimental data, further reduce Rfree [15], and provide additional insights into hydrogen-bond patterns and the influence of solvent on alternative conformations.”

      (23) Page 22. Sub-sections of "Methods" are given in a little bit random order; "Parallelization of large maps" in the middle of the text is an example. Put them in a better order may help.

      We have moved some sections of the Methods around and made better headings by using an underscore to highlight the subsections (Generating and running the qFit test set, qFit improved features, Analysis metrics, Generating synthetic data for resolution dependence).

      (24) Page 24. Non-convex solution is a strange term. There exist non-convex problems and functions and not solutions.

      We agree and we have changed the language to reflect that we present the algorithm with non-convex problems which it cannot solve.

      (25) Page 26, "Metrics". It is worthy to describe explicitly the metrics and not (only) the references to the scripts.

      For all metrics, we provide a sentence or two on what each metric describes. As these metrics are well known in the structural biology field, we do not feel that we need to elaborate on them more.

      (26) Page 26. Multiplying B by occupancy does not have much sense. A better option would be to refer to the density value in the atomic center as occ*(4*pi/B)^1.5 which gives a relation between these two entities.

      We agree and have updated the B-factor figures and metrics to reflect this.
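      The reviewer's expression can be evaluated directly. The sketch below simply computes occ*(4*pi/B)^1.5 as written in the remark; the function name is hypothetical, and the result is a relative measure of density at the atomic center rather than an absolute electron density, since no scattering factor is included.

```python
import math


def peak_density(occupancy, b_factor):
    """Relative density at the atomic center: occ * (4*pi / B) ** 1.5.

    Implements the reviewer's expression only; the name and the absence of
    any scattering-factor term are assumptions of this sketch.
    """
    if b_factor <= 0:
        raise ValueError("B-factor must be positive")
    return occupancy * (4.0 * math.pi / b_factor) ** 1.5


# Halving the occupancy halves the peak height at a fixed B-factor,
# while doubling the B-factor lowers it by a factor of 2 ** 1.5.
print(peak_density(1.0, 10.0), peak_density(0.5, 10.0), peak_density(1.0, 20.0))
```

      Unlike the product of B-factor and occupancy, this quantity moves in the expected direction for both inputs: it rises with occupancy and falls as the B-factor grows.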

      (27) Page 40, suppl. Fig. 5. Due to the color choice, it is difficult to distinguish the green and blue curves in the diagram.

      We have amended this by switching the colors of the curves.

      (28) Page 42, Suppl. Fig. 7. (A) How the width of shaded regions is defined? (B) What the blue regions stand for? Input Rfree range goes up to 0.26 and not to 0.25; there is a point at the right bound. (C) Bounds for the "orange" occupancy are inversed in the legend.

      (A) The width of the shaded region denotes the standard deviation of the values at each resolution. We have made this clearer in the caption.

      (B) The blue region denotes the confidence interval for the regression estimate. The size of the confidence interval was set to 95%. We have made this clearer in the caption.

      (C) This has now been fixed.

      The maximum R-free value is 0.2543, which we rounded down to 0.25.

      (29) Page 43. Letters E-H in the legend are erroneously substituted by B-E.

      We apologize for this mistake. It is now corrected.

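    1. In VS Code, the output is displayed in the built-in terminal window

      I think the translation is correct, but at first I read the Japanese as breaking after 「出力が組み込まれた」 ("the output is built in") and had to re-read it.

      Suggested alternative: Like many other editors, VS Code has a built-in terminal window, and the output is displayed there.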

    1. OPTIONAL stretch goal: see if you can find the emptyDir in your host's file system. It will involve finding out which node the pod is running on, connecting to that node and working out where in the file system the emptyDir is (you might be able to find a file named data-volume). Once you have found it, you could look for the files therein. Also, if you do take on this challenge, observe that, once you've deleted the pod, the directory is removed.

      What was the answer to this? Was it:

      kubectl get pod kvstore -o wide to find the name of the node that it's on - mine was on k8s-worker-1

      I then went into the SSH settings in Visual Studio Code and added a host, so my SSH config file is now

      Host worker0
        HostName 18.171.145.65
        User student
        IdentityFile c:\users\karen\downloads\qwikLABS-L138956-206416.pem
      Host worker1
        Hostname 35.178.200.149
        User student
        IdentityFile c:\users\karen\downloads\qwikLABS-L138956-206416.pem
      Host controller
        Hostname 13.40.152.189
        User student
        IdentityFile c:\users\karen\downloads\qwikLABS-L138956-206416.pem

      and I opened them up in 3 separate VS Code windows

      kubectl get pod <pod-name> -o jsonpath='{.metadata.uid}'

      kubectl get pod kvname -o jsonpath='{.metadata.uid}'

      and then

      in the terminal window on k8s-worker-1, I used the following path, replacing <podUID> with the ID retrieved by the command above:

      /var/lib/kubelet/pods/<podUID>/volumes/kubernetes.io~empty-dir/

      sudo ls /var/lib/kubelet/pods/37abdd08-c0f7-4549-a9cc-20df89ed7fa8/volumes/kubernetes.io~empty-dir/

      you have to run it with sudo permissions otherwise you get denied access, but then you can see data-volume

      I then did sudo -i

      cd /var/lib/kubelet/pods/37abdd08-c0f7-4549-a9cc-20df89ed7fa8/volumes/kubernetes.io~empty-dir/

      ls (to see the directory listing; it showed me data-volume)

      cd data-volume

      ls

      it then showed me age and name which were the two values I had put in there
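      The path used in the steps above can be assembled programmatically. The following is a minimal sketch that only builds the string; it assumes the kubelet root directory is /var/lib/kubelet, as it was on this lab cluster (it may differ elsewhere), and the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Assumed kubelet root directory, as seen on the lab's worker node.
KUBELET_ROOT = PurePosixPath("/var/lib/kubelet")


def empty_dir_path(pod_uid, volume_name="", kubelet_root=KUBELET_ROOT):
    """Return where an emptyDir volume lives on the node's file system.

    pod_uid is the value printed by:
        kubectl get pod <pod-name> -o jsonpath='{.metadata.uid}'
    """
    base = (PurePosixPath(kubelet_root) / "pods" / pod_uid
            / "volumes" / "kubernetes.io~empty-dir")
    return base / volume_name if volume_name else base


# The pod UID and volume name from the walkthrough above.
print(empty_dir_path("37abdd08-c0f7-4549-a9cc-20df89ed7fa8", "data-volume"))
```

      The printed path still has to be listed with sudo on the node that runs the pod, and it disappears once the pod is deleted.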

    1. interpreter

      A fully compiled language (e.g. C), on the other hand, is run directly on the machine's CPU after the source code is compiled into machine code.

      This interpreter is a software program that reads the code, analyzes it, and performs the actions specified in the code at runtime. Rather than producing a machine-code executable ahead of time, an interpreter executes the source line by line or statement by statement.

      The JVM is essentially an interpreter for the intermediate-level bytecode generated by compiling Java code (in practice, most JVMs also compile frequently executed bytecode to machine code at runtime).
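      The statement-by-statement execution described above can be illustrated with a toy interpreter. This is a minimal sketch for a made-up three-instruction language (set, add, print); the syntax and the function name are assumptions for illustration and do not correspond to any real language or to how the JVM works internally.

```python
def run(program):
    """Interpret a toy language one statement at a time.

    Each statement is read, analyzed (split into an operation and its
    arguments), and executed immediately; nothing is compiled beforehand.
    """
    variables = {}
    output = []
    for line_no, line in enumerate(program, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0], parts[1:]
        if op == "set":
            name, value = args
            variables[name] = int(value)
        elif op == "add":
            name, value = args
            variables[name] += int(value)
        elif op == "print":
            (name,) = args
            output.append(str(variables[name]))
        else:
            # Errors surface only when the faulty statement is reached.
            raise SyntaxError(f"line {line_no}: unknown statement {op!r}")
    return output


print(run(["set x 2", "add x 3", "print x"]))
```

      Because each line is analyzed only when it is reached, a mistake on the last line of a program would not be reported until every earlier statement had already run, which is one practical difference from a compiled language.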

    1. This erodes modularity

      models constrain modularity

      inherent conflict

      that's why things will never work

      models erode modularity

      models are truly intertwingled; modules encapsulate

      separating models with cross-cutting concerns and providing modules and layers that constrain and limit cross-cutting

      aspect orientation won't help either

      we need models, and the capabilities needed to effect intents need to coevolve; coevolve the three together: intent, model, component

      That's why MVC is just a broken idea. Period.

      Model View Controller

      That's why Dijkstra's "Go To Statement Considered Harmful" was the most harmful idea that ever influenced thinking. The nature of everything is intertwingularity: spaghetti code, named jumps. Yes, imposing structure can seemingly create order, but it is bound to be a self-limiting one. We need the ability for organic growth.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study makes a valuable empirical contribution to our understanding of visual processing in primates and deep neural networks, with a specific focus on the concept of factorization. The analyses provide solid evidence that high factorization scores are correlated with neural predictivity, yet more evidence would be needed to show that neural responses show factorization. Consequently, while several aspects require further clarification, in its current form this work is interesting to systems neuroscientists studying vision and could inspire further research that ultimately may lead to better models of or a better understanding of the brain.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The paper investigates visual processing in primates and deep neural networks (DNNs), focusing on factorization in the encoding of scene parameters. It challenges the conventional view that object classification is the primary function of the ventral visual stream, suggesting instead that the visual system employs a nuanced strategy involving both factorization and invariance. The study also presents empirical findings suggesting a correlation between high factorization scores and good neural predictivity.

      Strengths:

      (1) Novel Perspective: The paper introduces a fresh viewpoint on visual processing by emphasizing the factorization of non-class information.

      (2) Methodology: The use of diverse datasets from primates and humans, alongside various computational models, strengthens the validity of the findings.

      (3) Detailed Analysis: The paper suggests metrics for factorization and invariance, contributing to the future understanding and measurement of these concepts.

      Weaknesses:

      (1) Vagueness (Perceptual or Neural Invariance?): The paper uses the term 'invariance', typically referring to perceptual stability despite stimulus variability [1], as the complete discarding of nuisance information in neural activity. This oversimplification overlooks the nuanced distinction between perceptual invariance (e.g., invariant object recognition) and neural invariance (e.g., no change in neural activity). It seems that by 'invariance' the authors mean 'neural' invariance (rather than 'perceptual' invariance) in this paper, which is vague. The paper could benefit from changing what is called 'invariance' in the paper to 'neural invariance' and distinguish it from 'perceptual invariance,' to avoid potential confusion for future readers. The assignment of 'compact' representation to 'invariance' in Figure 1A is misleading (although it can be addressed by the clarification on the term invariance). [1] DiCarlo JJ, Cox DD. Untangling invariant object recognition. Trends in cognitive sciences. 2007 Aug 1;11(8):333-41.

      Thanks for pointing out this ambiguity. In our Introduction we now explicitly clarify that we use “invariance” to refer to neural, rather than perceptual invariance, and we point out that both factorization and (neural) invariance may be useful for obtaining behavioral/perceptual invariance.

      (2) Details on Metrics: The paper's explanation of factorization as encoding variance independently or uncorrelatedly needs more justification and elaboration. The definition of 'factorization' in Figure 1B seems to be potentially misleading, as the metric for factorization in the paper seems to be defined regardless of class information (can be defined within a single class). Does the factorization metric as defined in the paper (orthogonality of different sources of variation) warrant that responses for different object classes are aligned/parallel like in 1B (middle)? More clarification around this point could make the paper much richer and more interesting.

      Our factorization metric measures the degree to which two sets of scene variables are factorized from one another. In the example of Fig. 1B, we apply this definition to the case of factorization of class vs. non-class information. Elsewhere in the paper we measure factorization of several other quantities unrelated to class, specifically camera viewpoint, lighting conditions, background content, and object pose. In our revised manuscript we have clarified the exposition surrounding Fig. 1B to make it clear that factorization, as we define it, can be applied to other quantities as well and that responses do not need to be aligned/parallel but simply live in a different set of dimensions whether linearly or nonlinearly arranged. Thanks for raising the need to clarify this point.

      (3) Factorization vs. Invariance: Is it fair to present invariance vs. factorization as mutually exclusive options in representational hypothesis space? Perhaps a more fair comparison would be factorization vs. object recognition, as it is possible to have different levels of neural variability (or neural invariance) underlying both factorization and object recognition tasks.

      We do not mean to imply that factorization and invariance are mutually exclusive, or that they fully characterize the space of possible representations. However, they are qualitatively distinct strategies for achieving behavioral capabilities like object recognition. In the revised manuscript we also include a comparison to object classification performance (Figures 5C & S4, black x’s) as a predictor of brain-like representations, alongside the results for factorization and invariance.

      In our revised Introduction and beginning of the Results section, we make it more clear that factorization and invariance are not mutually exclusive – indeed, our results show that both factorization and invariance for some scene variables like lighting and background identity are signatures of brain-like representations. Our study focuses on factorization because we believe its importance has not been studied or highlighted to the degree that invariance to “nuisance” parameters has in concert with selectivity to object identity in individual neuron tuning functions. Moreover, the loss functions used for supervised training of neural networks for image classification would seem to encourage invariance as a representational strategy. Thus, the finding that factorization of scene parameters is an equally good if not better predictor of brain-like representations may motivate new objective functions for neural network training.

      (4) Potential Confounding Factors in Empirical Findings: The correlation observed in Figure 3 between factorization and neural predictivity might be influenced by data dimensionality, rather than factorization per se [2]. Incorporating discussions around this recent finding could strengthen the paper.

      [2] Elmoznino E, Bonner MF. High-performing neural network models of the visual cortex benefit from high latent dimensionality. bioRxiv. 2022 Jul 13:2022-07.

      We thank the Reviewer for pointing out this important potential confound and the need for a direct quantification. We have now included an analysis computing how well dimensionality (measured using the participation ratio metric for natural images, as was done in [2] Elmoznino & Bonner, bioRxiv, 2022) can account for model goodness-of-fit (additional pink bars in Figure 6). Factorization of scene parameters appears to add more predictive power than dimensionality on average (Figure 6, light shaded bars), and critically, factorization+classification jointly predict goodness-of-fit significantly better than dimensionality+classification for V4 and IT/HVC brain areas (Figure 6, dark shaded bars). Indeed, dimensionality+classification is only slightly more predictive than classification alone for V4 and IT/HVC, indicating some redundancy in those measures with respect to neural predictivity of models (Figure 6, compare dark shaded pink bar to dashed line).

      That said, high-dimensional representations can, in principle, better support factorization, and thus we do not regard these two representational strategies necessarily in competition. Rather, our results suggest (consistent with [2]) that dimensionality is predictive of brain-like representation to some degree, such that some (but not all) of factorization’s predictive power may indeed owe to a partial correlation with dimensionality. We elaborate in the Discussion where this point comes up and now refer to the updated Figure 6 that shows the control for dimensionality.
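
      For concreteness, the participation ratio mentioned above can be sketched as follows. This is an illustrative sketch rather than the analysis code behind Figure 6: it assumes the standard definition, the squared sum of the covariance eigenvalues divided by the sum of their squares, and the function name and the `responses` array layout (stimuli x units) are hypothetical.

```python
import numpy as np


def participation_ratio(responses):
    """Effective dimensionality of a (n_stimuli, n_units) response matrix.

    Assumed definition: PR = (sum_i lambda_i)^2 / sum_i lambda_i^2,
    where lambda_i are the eigenvalues of the unit-by-unit covariance matrix.
    """
    responses = np.asarray(responses, dtype=float)
    # Covariance across stimuli, one row/column per unit.
    cov = np.cov(responses, rowvar=False)
    # eigvalsh is appropriate because a covariance matrix is symmetric.
    eigvals = np.linalg.eigvalsh(cov)
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()
```

      Under this definition the value ranges from 1, when all variance lies along a single axis, up to the number of units, when variance is spread evenly across all axes.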

      Conclusion:

      The paper offers insightful empirical research with useful implications for understanding visual processing in primates and DNNs. The paper would benefit from a more nuanced discussion of perceptual and neural invariance, as well as a deeper discussion of the coexistence of factorization, recognition, and invariance in neural representation geometry. Additionally, addressing the potential confounding factors in the empirical findings on the correlation between factorization and neural predictivity would strengthen the paper's conclusions.

      Taken together, we hope that the changes described above address the distinction between neural and perceptual invariance, provide a more balanced understanding of the contributions of factorization, invariance, and local representational geometry, and rule against dimensionality for natural images as contributing to the main finding of the benefits from factorization of scene parameters.

      Reviewer #2 (Public Review):

      Summary:

      The dominant paradigm in the past decade for modeling the ventral visual stream's response to images has been to train deep neural networks on object classification tasks and regress neural responses from units of these networks. While object classification performance is correlated to the variance explained in the neural data, this approach has recently hit a plateau of variance explained, beyond which increases in classification performance do not yield improvements in neural predictivity. This suggests that classification performance may not be a sufficient objective for building better models of the ventral stream. Lindsey & Issa study the role of factorization in predicting neural responses to images, where factorization is the degree to which variables such as object pose and lighting are represented independently in orthogonal subspaces. They propose factorization as a candidate objective for breaking through the plateau suffered by models trained only on object classification.

      They claim that (i) maintaining these non-class variables in a factorized manner yields better neural predictivity than ignoring non-class information entirely, and (ii) factorization may be a representational strategy used by the brain.

      The first of these claims is supported by their data. The second claim does not seem well-supported, and the usefulness of their observations is not entirely clear.

      Strengths:

      This paper challenges the dominant approach to modeling neural responses in the ventral stream, which itself is valuable for diversifying the space of ideas.

      This paper uses a wide variety of datasets, spanning multiple brain areas and species. The results are consistent across the datasets, which is a great sign of robustness.

      The paper uses a large set of models from many prior works. This is impressively thorough and rigorous.

      The authors are very transparent, particularly in the supplementary material, showing results on all datasets. This is excellent practice.

      Weaknesses:

      (1) The primary weakness of this paper is a lack of clarity about what exactly is the contribution. I see two main interpretations: (1-A) As introducing a heuristic for predicting neural responses that improves over classification accuracy, and (1-B) as a model of the brain's representational strategy. These two interpretations are distinct goals, each of which is valuable. However, I don't think the paper in its current form supports either of them very well:

      (1-A) Heuristic for neural predictivity. The claim here is that by optimizing for factorization, we could improve models' neural predictivity to break through the current predictivity plateau. To frame the paper in this way, the key contribution should be a new heuristic that correlates with neural predictivity better than classification accuracy. The paper currently does not do this. The main piece of evidence that factorization may yield a more useful heuristic than classification accuracy alone comes from Figure 5. However, in Figure 5 it seems that factorization along some factors is more useful than others, and different linear combinations of factorization and classification may be best for different data. There is no single heuristic presented and defended. If the authors want to frame this paper as a new heuristic for neural predictivity, I recommend the authors present and defend a specific heuristic that others can use, e.g. [K * factorization_of_pose + classification] for some constant K, and show that (i) this correlates with neural predictivity better than classification alone, and (ii) this can be used to build models with higher neural predictivity. For (ii), they could fine-tune a state-of-the-art model to improve this heuristic and show that doing so achieves a new state-of-the-art neural predictivity. That would be convincing evidence that their contribution is useful.

      Our paper does not make any strong claim regarding the Reviewer’s point 1-A (on heuristics for neural predictivity). In the Discussion, last paragraph, we better specify that our work is merely suggestive of claim 1-A about heuristics for more neurally predictive, more brainlike models. We believe that our paper supports the Reviewer’s point 1-B (on brain representation) as we discuss below.

      We leave it to future work to determine if factorization could help optimize models to be more brainlike. This treatment may require exploration of novel model architectures and loss functions, and potentially also more thorough neural datasets that systematically vary many different forms of visual information for validating any new models.

      (1-B) Model of representation in the brain. The claim here is that factorization is a general principle of representation in the brain. However, neural predictivity is not a suitable metric for this, because (i) neural predictivity allows arbitrary linear decoders, hence is invariant to the orthogonality requirement of factorization, and (ii) neural predictivity does not match the network representation to the brain representation. A better metric is representational dissimilarity matrices. However, the RDM results in Figure S4 actually seem to show that factorization does not do a very good job of predicting neural similarity (though the comparison to classification accuracy is not shown), which suggests that factorization may not be a general principle of the brain. If the authors want to frame the paper in terms of discovering a general principle of the brain, I suggest they use a metric (or suite of metrics) of brain similarity that is sensitive to the desiderata of factorization, e.g. doesn't apply arbitrary linear transformations, and compare to classification accuracy in addition to invariance.

      We agree with the Reviewer about the shortcomings of neural predictivity for comparing representational geometries, and in our revised manuscript we have provided a more comprehensive set of results that includes RDM predictivity in new Figures 6 & 7, alongside the results for neural fit predictivity. In addition, as suggested we added classification accuracy predictivity in Figures 5C & S4 (black x’s) for visual comparison to factorization/invariance. In Figure S4 on RDMs, it is apparent how factorization is at least as good a predictor as classification on all V4 & IT datasets from both monkeys and humans (compare x’s to filled circles in Figure S4; note that some of the points from the original Figure S4 changed as we discovered a bug in the code that specifically affected the RDM analysis for a few of the datasets).

      We find that the newly included RDM analyses in Figures 6 & 7 are consistent with the conclusions of the neural fit regression analyses: the correlations of factorization metrics with RDM matches are strong, comparable in magnitude to that of classification accuracy (Figure 6, 3rd & 4th columns, compare black dashed line to faded colored bars), and are not fully accounted for by the model’s classification accuracy alone (Figure 6, 3rd & 4th columns, higher unfaded bars for classification combined with factorization, and see corresponding example scatters in Figure 7 middle/bottom rows).

      It is encouraging that the added benefit of factorization for RDM predictivity accounting for classification performance is at least as good as the improvement seen for neural fit predictivity (Figure 6, 1st & 2nd columns for encoding fits versus 3rd & 4th columns for RDM correlations).
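
      To illustrate why an RDM-based comparison is stricter than a regression fit, here is a minimal sketch, not the authors' pipeline. It assumes correlation distance (1 minus the Pearson correlation between response patterns) as the dissimilarity measure and a Pearson correlation between the upper triangles of two RDMs as the match score. Both choices, and the names `rdm` and `rdm_match`, are assumptions made for illustration.

```python
import numpy as np


def rdm(responses):
    """Representational dissimilarity matrix for (n_stimuli, n_units) responses.

    Assumed dissimilarity: 1 - Pearson correlation between the response
    patterns evoked by each pair of stimuli.
    """
    responses = np.asarray(responses, dtype=float)
    # np.corrcoef treats each row (one stimulus) as a variable.
    return 1.0 - np.corrcoef(responses)


def rdm_match(resp_a, resp_b):
    """Correlate the off-diagonal upper triangles of two RDMs.

    Both inputs must describe the same stimuli in the same order;
    the number of units may differ between them.
    """
    n_stimuli = np.asarray(resp_a).shape[0]
    upper = np.triu_indices(n_stimuli, k=1)
    return np.corrcoef(rdm(resp_a)[upper], rdm(resp_b)[upper])[0, 1]
```

      Because no linear map is fitted between model and brain, this score generally changes when the model's units are mixed by an arbitrary linear transformation, which is the kind of sensitivity the Reviewer asks for.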

      (2) I think the comparison to invariance, which is pervasive throughout the paper, is not very informative. First, it is not surprising that invariance is more weakly correlated with neural predictivity than factorization, because invariant representations lose information compared to factorized representations. Second, there has long been extensive evidence that responses throughout the ventral stream are not invariant to the factors the authors consider, so we already knew that invariance is not a good characterization of ventral stream data.

      While we appreciate the Reviewer’s intuition that highly invariant representations are not strongly supported in the high-level visual cortex, we nevertheless thought it was valuable to put this intuition to a quantitative, detailed test. As a result, we uncovered effects that were not obvious a priori, at least to us – for example, that invariance for some scene parameters (camera view, object pose) is negatively correlated with neural predictions while invariance to others (background, lighting) is positively correlated. Thus, our work exercises the details of invariance for different types of information.

      (3) The formalization of the factorization metric is not particularly elegant, because it relies on computing top K principal components for the other-parameter space, where K is arbitrarily chosen as 10. While the authors do show that in their datasets the results are not very sensitive to K (Figure S5), that is not guaranteed to be the case in general. I suggest the authors try to come up with a formalization that doesn't have arbitrary constants. For example, one possibility that comes to mind is E[delta_a x delta_b], where 'x' is the normalized cross product, delta_a and delta_b are deltas in representation space induced by perturbations of factors a and b, and the expectation is taken over all base points and deltas. This is just the first thing that comes to mind, and I'm sure the authors can come up with something better. The literature on disentangling metrics in machine learning may be useful for ideas on measuring factorization.

      Thanks to the Reviewer for raising this point. First, we wish to clarify a potential misunderstanding of the factorization metric: the number K of principal components we choose is not an arbitrary constant, but rather calibrated to capture a certain fraction of variance, set to 90% by default in our analyses. While this variance threshold is indeed an arbitrary hyperparameter, it has a more intuitive interpretation than the number of principal components.
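
      A rough sketch of how a variance threshold, rather than a fixed K, can select the other-parameter subspace is given here. This is not the authors' code: it assumes the score is one minus the fraction of the parameter-of-interest variance that falls inside the subspace spanned by the leading principal components of the other-parameter responses. The function name, argument names, and array layout (samples x units) are hypothetical.

```python
import numpy as np


def pca_factorization(resp_param, resp_other, var_threshold=0.9):
    """Assumed PCA-based factorization score.

    resp_param: (n_samples, n_units) responses while only the parameter of
                interest varies.
    resp_other: (n_samples, n_units) responses while the other parameters vary.
    var_threshold: fraction of other-parameter variance the subspace must capture.
    """
    xp = np.asarray(resp_param, dtype=float)
    xo = np.asarray(resp_other, dtype=float)
    xp = xp - xp.mean(axis=0)
    xo = xo - xo.mean(axis=0)

    # PCA of the other-parameter responses via SVD; rows of vt are the PCs.
    _, sing, vt = np.linalg.svd(xo, full_matrices=False)
    explained = np.cumsum(sing ** 2) / np.sum(sing ** 2)
    # Smallest K whose cumulative explained variance reaches the threshold.
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(sing))
    basis = vt[:k]

    # Share of the parameter-induced variance lying inside that subspace.
    var_inside = np.sum((xp @ basis.T) ** 2)
    var_total = np.sum(xp ** 2)
    return 1.0 - var_inside / var_total
```

      With this reading, a score near 1 means the parameter of interest varies in directions the other parameters leave almost untouched, and a score near 0 means the two share the same directions.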

      Nonetheless, the Reviewer’s comment did inspire us to consider another metric for factorization that does not depend on any arbitrary parameters. In the revised version, we now include a covariance matrix based metric which simply measures the elementwise correlation of the covariance matrices induced by varying the scene parameter of interest and the covariance matrix induced by varying the other parameters (and then subtracts this quantity from 1).

      Correspondingly, we now present results for both the new covariance based measure and the original PCA based one in Figures 5C, 6, and 7. The main findings remain largely the same when using the covariance based metric (Figure 5C, compare light shaded to dark shaded filled circles; Figure 6, compare top row to bottom row; Figure 7, compare middle rows to bottom rows).

      Ultimately, we believe these two metrics are complementary and somewhat analogous to two metrics commonly used for measuring dimensionality (the number of components needed to explain a certain fraction of the variance, analogous to our original PCA based definition; the participation ratio, analogous to our covariance based definition). We have added the formula for the covariance based factorization metric along with a brief description to the Methods.
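
      The covariance based metric described in this response can be sketched as below. This is one possible reading, not the formula from the Methods: it assumes "elementwise correlation" means a Pearson correlation computed over the flattened entries of the two covariance matrices. The function and argument names are hypothetical.

```python
import numpy as np


def covariance_factorization(resp_param, resp_other):
    """Assumed covariance-based factorization score.

    resp_param: (n_samples, n_units) responses while only the parameter of
                interest varies.
    resp_other: (n_samples, n_units) responses while the other parameters vary.
    Returns 1 minus the correlation between the entries of the two
    unit-by-unit covariance matrices.
    """
    cov_param = np.cov(np.asarray(resp_param, dtype=float), rowvar=False)
    cov_other = np.cov(np.asarray(resp_other, dtype=float), rowvar=False)
    # Flatten both matrices and correlate them entry by entry.
    r = np.corrcoef(cov_param.ravel(), cov_other.ravel())[0, 1]
    return 1.0 - r
```

      Under this reading the score is 0 when both sources of variation produce the same covariance structure, and it has no free parameter such as K or a variance threshold. Because a Pearson correlation can be negative, the score can exceed 1 when the two covariance structures are anti-correlated.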

      (4) The authors defined the term "factorization" according to their metric. I think introducing this new term is not necessary and can be confusing because the term "factorization" is vague and used by different researchers in different ways. Perhaps a better term is "orthogonality", because that is clear and seems to be what the authors' metric is measuring.

      We agree with the Reviewer that factorization has become an overloaded term. At the same time, we think that in this context, the connotation of the term factorization effectively conveys the notion of separating out different latent sources of variance (factors) such that they can be encoded in orthogonal subspaces.

      To aid clarity, we now mention in the Introduction that factorization defined here is meant to measure orthogonalization of scene factors. Additionally, in the Discussion section, we now go into more detail comparing our metric to others previously used in the literature, including orthogonality, to help put it in context.

      (5) One general weakness of the factorization paradigm is the reliance on a choice of factors. This is a subjective choice and becomes an issue as you scale to more complex images where the choice of factors is not obvious. While this choice of factors cannot be avoided, I suggest the authors add two things: First, an analysis of how sensitive the results are to the choice of factors (e.g. transform the basis set of factors and re-run the metric); second, include some discussion about how factors may be chosen in general (e.g. based on temporal statistics of the world, independent components analysis, or something else).

      The Reviewer raises a very reasonable point about the limitation of this work. While we limited our analysis to generative scene factors that we know about and that could be manipulated, there are many potential factors to consider. It is not clear to us exactly how to implement the Reviewer’s suggestion of transforming the basis set of factors, as the factors we consider are highly nonlinear in the input space. Ultimately, we believe that finding unsupervised methods to characterize the “true” set of factors that is most useful for understanding visual representations is an important subject for future work, but outside the scope of this particular study. We have added a comment to this effect in the Discussion.

      Reviewer #3 (Public Review):

      Summary:

      Object classification serves as a vital normative principle in both the study of the primate ventral visual stream and deep learning. Different models exhibit varying classification performances and organize information differently. Consequently, a thriving research area in computational neuroscience involves identifying meaningful properties of neural representations that act as bridges connecting performance and neural implementation. In the work of Lindsey and Issa, the concept of factorization is explored, which has strong connections with emerging concepts like disentanglement [1,2,3] and abstraction [4,5]. Their primary contributions encompass two facets: (1) The proposition of a straightforward method for quantifying the degree of factorization in visual representations. (2) A comprehensive examination of this quantification through correlation analysis across deep learning models.

      To elaborate, their methodology, inspired by prior studies [6], employs visual inputs featuring a foreground object superimposed onto natural backgrounds. Four types of scene variables, such as object pose, are manipulated to induce variations. To assess the level of factorization within a model, they systematically alter one of the scene variables of interest and estimate the proportion of encoding variances attributable to the parameter under consideration.

      The central assertion of this research is that factorization represents a normative principle governing biological visual representation. The authors substantiate this claim by demonstrating an increase in factorization from macaque V4 to IT, supported by evidence from correlated analyses revealing a positive correlation between factorization and decoding performance. Furthermore, they advocate for the inclusion of factorization as part of the objective function for training artificial neural networks. To validate this proposal, the authors systematically conduct correlation analyses across a wide spectrum of deep neural networks and datasets sourced from human and monkey subjects. Specifically, their findings indicate that the degree of factorization in a deep model positively correlates with its predictability concerning neural data (i.e., goodness of fit).

      Strengths:

      The primary strength of this paper is the authors' efforts in systematically conducting analysis across different organisms and recording methods. Also, the definition of factorization is simple and intuitive to understand.

      Weaknesses:

      This work exhibits two primary weaknesses that warrant attention: (i) the definition of factorization and its comparison to previous, relevant definitions, and (ii) the chosen analysis method.

      Firstly, the definition of factorization presented in this paper is founded upon the variances of representations under different stimuli variations. However, this definition can be seen as a structural assumption rather than capturing the effective geometric properties pertinent to computation. More precisely, the definition here is primarily statistical in nature, whereas previous methodologies incorporate computational aspects such as deviation from ideal regressors [1], symmetry transformations [3], generalization [5], among others. It would greatly enhance the paper's depth and clarity if the authors devoted a section to comparing their approach with previous methodologies [1,2,3,4,5], elucidating any novel insights and advantages stemming from this new definition.

      [1] Eastwood, Cian, and Christopher KI Williams. "A framework for the quantitative evaluation of disentangled representations." International conference on learning representations. 2018.

      [2] Kim, Hyunjik, and Andriy Mnih. "Disentangling by factorising." International Conference on Machine Learning. PMLR, 2018.

      [3] Higgins, Irina, et al. "Towards a definition of disentangled representations." arXiv preprint arXiv:1812.02230 (2018).

      [4] Bernardi, Silvia, et al. "The geometry of abstraction in the hippocampus and prefrontal cortex." Cell 183.4 (2020): 954-967.

      [5] Johnston, W. Jeffrey, and Stefano Fusi. "Abstract representations emerge naturally in neural networks trained to perform multiple tasks." Nature Communications 14.1 (2023): 1040.

      Thanks to the Reviewer for this suggestion. We agree that our initial submission did not sufficiently contextualize our definition of factorization with respect to other related notions in the literature. We have added additional discussion of these points to the Discussion section in the revised manuscript and have included therein the citations provided by the Reviewer (please see the third paragraph of Discussion).

      Secondly, in order to establish a meaningful connection between factorization and computation, the authors rely on a straightforward synthetic model (Figure 1c) and employ multiple correlation analyses to investigate relationships between the degree of factorization, decoding performance, and goodness of fit. Nevertheless, the results derived from the synthetic model are limited to the low training-sample regime. It remains unclear whether the biological datasets under consideration fall within this low training-sample regime or not.

      We agree that our model in Figure 1C is very simple and does not fully capture the complex interactions between task performance and features of representational geometry, like factorization. We intend it only as a proof of concept to illustrate how factorized representations can be beneficial for some downstream task use cases. While the benefits of factorized representations disappear for large numbers of samples in this simulation, we believe this is primarily a consequence of the simplicity and low dimensionality of the simulation. Real-world visual information is complex and high-dimensional, and as such the relevant sample size regime in which factorization offers task benefits may be much greater. As a first step toward this real-world setting, Figure 2 shows how decreasing the amount of factorization in neural population data in macaque V4/IT can have an effect on object identity decoding.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      Missing citations: The paper could benefit from discussions & references to related papers, such as:

      Higgins I, Chang L, Langston V, Hassabis D, Summerfield C, Tsao D, Botvinick M. Unsupervised deep learning identifies semantic disentanglement in single inferotemporal face patch neurons. Nature communications. 2021 Nov 9;12(1):6456.

      We have added additional discussion of related work, including the suggested reference and others on disentanglement, to the Discussion section in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Here are several small recommendations for the authors, all much more minor than those in the public review:

      I suggest more use of equations in methods sections about Figure 1C and macaque neural data analysis.

      Thanks for this suggestion. We have added new Equation 1 for the method transforming neural data to reduce factorization of a variable while preserving other firing rate statistics.

      In Figure 1-C, the methods indicate that Gaussian noise was added. This is a very important detail, and complexifies the interpretation of the figure because it adds an assumption about the structure of noise. In other words, if I understand correctly, the correct interpretation of Figure 1C is "assuming i.i.d. noise, decoding accuracy improves with factorization." The i.i.d. noise is a big assumption, and it is debated how well the brain satisfies this assumption. I suggest you either omit noise for this figure or clearly state in the main text (e.g. caption) that the figure must be interpreted under an i.i.d. noise assumption.

      We have added an explicit statement of the i.i.d. noise assumption to the Figure 1C legend.

      For Figure 2B, I suggest labeling the x-axis clearly below the axis on both panels. Currently, it is difficult to read, particularly in print.

      We have made the x-axis labels clearer and included them on both panels.

      Figure 3A is difficult to read because of the very small text. I suggest avoiding such small fonts.

      We agree that Figure 3A is difficult to read. We have broken out Figure 3 into two new Figures 3 & 4 to increase clarity and sizing of text in Figure 3A.

      Reviewer #3 (Recommendations For The Authors):

      To strengthen this work, it is advisable to incorporate more comprehensive comparisons with previous research, particularly within the machine learning (ML) community. For instance, it would be beneficial to explore and reference works focusing on disentanglement [1,2,3]. This would provide valuable context and facilitate a more robust understanding of the contributions and novel insights presented in the current study.

      We have added additional discussion of related work and other notions similar to factorization to the Discussion section in the revised manuscript.

      Additionally, improving the quality of the figures is crucial to enhance the clarity of the findings:

      • Figure 2: The caption of subfigure B could be revised for greater clarity.

      Thank you, we have substantially clarified this figure caption.

      • Figure 3: Consider a more equitable approach for computing the correlation coefficient, such as calculating it separately for different types of models. In the case of supervised models, it appears that the correlation between invariance and goodness of fit may not be negligible across various scene parameters.

      We appreciate the suggestion, but we are not confident in our ability to conclude much from analyses restricted to particular model classes, given the relatively small N and the fact that the different model classes themselves are an important source of variance in our data.

      • Figure 4: To enhance the interpretability of subfigures A and B, it may be beneficial to include p-values (indicating confidence levels).

      As we supply bootstrapped confidence intervals for our results, which provide at least as much information as p-values, and most of the effects of interest are fairly stark when comparing invariance to factorization, p-values were not needed to support our points. We added a sentence to the legend of new Figure 5 (previously Figure 4) indicating that error bars reflect standard deviations over bootstrap resampling of the models.
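      As a rough illustration of the procedure described in this response, the sketch below resamples models with replacement and reports the standard deviation and a 95% percentile interval of a correlation across resamples. It is only a minimal sketch under stated assumptions: the score lists, the choice of Pearson correlation, and the function names are hypothetical and are not taken from the authors' code.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_correlation(xs, ys, n_boot=2000, seed=0):
    """Resample (x, y) pairs with replacement; return (sd, (lo, hi))."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        # Skip degenerate resamples in which one variable is constant.
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue
        stats.append(pearson_r(bx, by))
    stats.sort()
    lo = stats[int(0.025 * len(stats))]
    hi = stats[int(0.975 * len(stats)) - 1]
    return statistics.stdev(stats), (lo, hi)


# Hypothetical per-model scores, invented purely for illustration.
factorization = [0.12, 0.25, 0.31, 0.40, 0.47, 0.55, 0.63, 0.71, 0.80, 0.92]
neural_fit = [0.20, 0.22, 0.35, 0.33, 0.45, 0.52, 0.50, 0.66, 0.70, 0.78]

sd, (lo, hi) = bootstrap_correlation(factorization, neural_fit)
print(f"bootstrap SD = {sd:.3f}, 95% interval = [{lo:.3f}, {hi:.3f}]")
```

      An interval of this kind that excludes zero carries much the same message as a small p-value, which is the sense in which the two are comparable here.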

      • Figure 5: For subfigure B, it could be advantageous to plot the results solely for factorization, allowing for a clear assessment of whether the high correlation observed in Classification+Factorization arises from the combined effects of both factors or predominantly from factorization alone.

      First, we clarify/note that the scatters solely for factorization that the Reviewer seeks are already presented earlier in the manuscript across all conditions in Figures 4A,B and Figure S2.

      While we could also include these in new Figure 7 (previously Figure 5B) as the Reviewer suggests, we believe it would distract from the message of that figure at the end of the manuscript – which is that factorization is useful as a supplement to classification in predictive matches to neural data. Nonetheless, new Figure 6 (old Figure 5A) provides a summary quantification of the information that the reviewer requests (Fig. 6, faded colored bars reflect the contribution of factorization alone).

    1. Reviewer #1 (Public Review):

      Rebecca R.G. et al. set out to determine the function of grid cells. They present an interesting case claiming that the spatial periodicity seen in the grid pattern provides a parsimonious solution to the task of coding 2D trajectories using sequential cell activation. Thus, this work defines a probable function grid cells may serve (here, the function is coding 2D trajectories), and proves that the grid pattern is a solution to that function. This approach is somewhat reminiscent in concept of previous works that defined a probable function of grid cells (e.g., path integration) and constructed normative models for that function that yield a grid pattern. However, the model presented here gives clear geometric reasoning to its case.

      Stemming from 4 axioms, the authors present a concise demonstration of the mathematical reasoning underlying their case. The argument is interesting and the reasoning is valid, and this work is a valuable addition to the ongoing body of work discussing the function of grid cells.

      However, the case uses several assumptions that need to be clearly stated as assumptions, clarified, and elaborated on. Most importantly, the choice of grid function is grounded in two assumptions: (1) that the grid function relies on the activation of cell sequences, and (2) that the grid function is related to the coding of trajectories. While these are interesting and valid suggestions, since they are used as the basis of the argument, the current justification could be strengthened (references 28-30 deal with the hippocampus, reference 31 is interesting but cannot hold the whole case).

      The work further leans on the assumption that sequences in the same direction should be similar regardless of their position in space. It is not clear why that should necessarily be the case, or how the position is extracted for similar sequences in different positions. The authors also strengthen their model with the requirement that grid cells should code for infinite space. However, the grid pattern anchors to borders and might be used to code navigated areas locally. Finally, referencing ref. 14, the authors claim that there is no existing theory for the emergence of grid cell firing that unifies the experimental observations on periodic firing patterns and their distortions under a single framework. However, that same reference presents exactly that: a mathematical model of pairwise interactions that unifies experimental observations. The authors should clarify this point.

    2. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors consider why grid cells might exhibit hexagonal symmetry - i.e., for what behavioral function might this hexagonal pattern be uniquely suited? The authors propose that this function is the encoding of spatial trajectories in 2D space. To support their argument, the authors first introduce a set of definitions and axioms, which then lead to their conclusion that a hexagonal pattern is the most efficient or parsimonious pattern one could use to uniquely label different 2D trajectories using sequences of cells. The authors then go through a set of classic experimental results in the grid cell literature - e.g. that the grid modules exhibit a multiplicative scaling, that the grid pattern expands with novelty or is warped by reward, etc. - and describe how these results are either consistent with or predicted by their theory. Overall, this paper asks a very interesting question and provides an intriguing answer. However, the theory appears to be extremely flexible and very similar to ideas that have been previously proposed regarding grid cell function.

      Major strengths:

      The general idea behind the paper is very interesting - why *does* the grid pattern take the form of a hexagonal grid? This is a question that has been raised many times; finding a truly satisfying answer is difficult but of great interest to many in the field. The authors' main assertion that the answer to this question has to do with the ability of a hexagonal arrangement of neurons to uniquely encode 2D trajectories is an intriguing suggestion. It is also impressive that the authors considered such a wide range of experimental results in relation to their theory.

      Major weaknesses:

      One major weakness I perceive is that the paper overstates what it delivers, to an extent that I think it can be a bit confusing to determine what the contributions of the paper are. In the introduction, the authors claim to provide "mathematical proof that ... the nature of the problem being solved by grid cells is coding of trajectories in 2-D space using cell sequences. By doing so, we offer a specific answer to the question of why grid cell firing patterns are observed in the mammalian brain." This paper does not provide proof of what grid cells are doing to support behavior or provide the true answer as to why grid patterns are found in the brain. The authors offer some intriguing suggestions or proposals as to why this might be based on what hexagonal patterns could be good for, but I believe that the language should be clarified to be more in line with what the authors present and what the strength of their evidence is.

      Relatedly, the authors claim that they find a teleological reason for the existence of grid cells - that is, discover the function that they are used for. However, in the paper, they seem to instead assume a function based on what is known and generally predicted for grid cells (encode position), and then show that for this specific function, grid cells have several attractive properties.

      There is also some other work that seems very relevant, as it discusses specific computational advantages of a grid cell code but was not cited here: https://www.nature.com/articles/nn.2901.

      A second major weakness was that some of the claims in the section in which they compared their theory to data seemed either confusing or a bit weak. I am not a mathematician, so I was not able to follow all of the logic of the various axioms, remarks, or definitions to understand how the authors got to their final conclusion, so perhaps that is part of the problem. But below I list some specific examples where I could not follow why their theory predicted the experimental result, or how their theory ultimately operated any differently from the conventional understanding of grid cell coding. In some cases, it also seemed that the general idea was so flexible that it perhaps didn't hold much predictive power, as extra details seemed to be added as necessary to make the theory fit with the data.

      I don't quite follow how, for at least some of their model predictions, the 'sequence code of trajectories' theory differs from the general attractor network theory. It seems from the introduction that these theories are meant to serve different purposes, but the section of the paper in which the authors claim that various experimental results are predicted by their theory makes this comparison difficult for me to understand. For example, in the section describing the effect of environmental manipulations in a familiar environment, the authors state that the experimental results make sense if one assumes that sequences are anchored to landmarks. But this sounds just like the classic attractor-network interpretation of grid cell activity - that it's a spatial metric that becomes anchored to landmarks.

      It was not clear to me why their theory predicted the field size/spacing ratio or the orientation of the grid pattern to the wall.

      I don't understand how repeated advancement of one unit to the next, as shown in Figure 4E, would cause the change in grid spacing near a reward.

      I don't follow how this theory predicts the finding that the grid pattern expands with novelty. The authors propose that this occurs because the animals are not paying attention to fine spatial details, and thus only need a low-resolution spatial map that eventually turns into a higher-resolution one. But it's not clear to me why one needs to invoke the sequence coding hypothesis to make this point.

      The last section, which describes that the grid spacing of different modules is scaled by the square root of 2, says that this is predicted if the resolution is doubled or halved. I am not sure if this is specifically a prediction of the sequence coding theory the authors put forth though since it's unclear why the resolution should be doubled or halved across modules (as opposed to changed by another factor).
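      The arithmetic behind this last point can be written out under one assumption that is not taken from the manuscript: that "resolution" means the number of grid fields per unit area. For a hexagonal lattice with spacing $s$, the unit cell has area $A(s)$ and holds one field, so the field density $\rho(s)$ is its inverse:

```latex
\[
A(s) = \frac{\sqrt{3}}{2}\, s^{2},
\qquad
\rho(s) = \frac{1}{A(s)} = \frac{2}{\sqrt{3}\, s^{2}},
\qquad
\frac{\rho(\sqrt{2}\, s)}{\rho(s)} = \frac{s^{2}}{2 s^{2}} = \frac{1}{2}.
\]
```

      More generally, changing the density by a factor $k$ corresponds to changing the spacing by $\sqrt{k}$, so the $\sqrt{2}$ ratio follows only once a factor of two in resolution is assumed.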

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Weaknesses:

      The readability could be improved.

      We have gone through the paper again and tried to revise the text to improve readability.

      Reviewer #1 (Recommendations For The Authors):

      (1) Thank you for adding the discrimination ratio. However, as Fig 2 and 3 depict the same experimental data, consider harmonizing the presentation (symbols and colors) and consolidating the Figs for clarity.

      This is an excellent point but it is actually very hard to harmonize symbols and colors because the data are divided in different ways. Upon considering this further, we actually don’t want to make the symbols and colors the same because it would be misleading. For example, WT and Tg training and testing session data are divided into grey and white throughout Figure 2, but in Figure 3, training and testing session data are pooled. To color code them grey and white in Figure 3 might make it seem that in Figure 3 training and testing were separated.

      (2) Fig 5 is missing

      We are not sure why Figure 5 was absent since it was present in our copy of the submitted pdf. We have double checked and in the revised manuscript we are sure Figure 5 is included.  

      (3) Fig 6 add raw data for WT

      We have added raw WT data. Revised Figure 6 includes the raw data in part A4.

      (4) Fig 7 add raw data for WT

      We have added raw WT data. Revised Figure 7 includes the raw data in part A4.

    1. Figure 5-10

      Description

      Put the User into the picture

      with local-first offline first owned

      State undergoing Representational State Transitions

      Not transferring stateless representations of a state

      What a hopeless idea that was

      on the page that it effects

      intentionally transparent, coherent, meaningful state transitions that are themselves immutable and linked

      both in chronological terms and with everything else that bears upon the transition

      unconstrained invocation of recorded permanent verifiable code and information connected to the Human Actors responsible for creating and holding them

    1. Reviewer #3 (Public Review):

      Summary:

      This paper presents bees with varying levels of experience with a choice task where bees have to choose to pull either a connected or unconnected string, each attached to a yellow flower containing sugar water. Bees without experience of string pulling did not choose the connected string above chance (experiment 1), but with experience of horizontal string pulling (as in the right-hand panel of Figure 4) bees did choose the connected string above chance (experiments 2-3), even when the string colour changed between training and test (experiments 4-5). Bees that were not provided with perceptual-motor feedback (i.e., they could not observe that each pull of the string moved the flower) during training still learned to string pull and then chose the connected string option above chance (experiment 6). Bees with normal experience of string pulling then failed to discriminate between connected and unconnected strings when the strings were coiled or looped, rather than presented straight (experiments 7-8).

      Weaknesses:

      The authors have only provided video of some of the conditions where the bees succeeded. In general, I think a video explaining each condition and then showing a clip of a typical performance would make it much easier to follow the study designs for scholars. Videos of the conditions bees failed at would be highly useful in order to compare different hypotheses for how the bees are solving this problem. I also think it is highly important to code the videos for switching behaviours. When solving the connected vs unconnected string tasks, when bees were observed pulling the unconnected string, did they quickly switch to the other string? Or did they continue to pull the wrong string? This would help discriminate the use of perceptual-motor feedback from other hypotheses.

      The experiments are also not described well. For my comments below, I have assumed that different groups of bees were tested for experiments 1-8, and that experiment 6 was run as described in line 331, where bees were given string-pulling training without perceptual feedback, rather than as described in Figure 4B, which describes bees as receiving string-pulling training with feedback.

      The authors suggest the bees' performance is best explained by what they term 'image matching'. However, experiment 6 does not seem to support this without assuming retroactive image matching after the problem is solved. The logic of experiment 6 is described as "This was to ensure that the bees could not see the familiar "lollipop shape" while pulling strings....If the bees prefer to pull the connected strings, this would indicate that bees memorize the arrangement of strings-connected flowers in this task." I disagree with this second sentence: removing perceptual feedback during training would prevent bees memorising the lollipop shape because, while solving the task, they don't actually see a string connected to a yellow flower, due to the black barrier. At the end of the task, the string is now behind the bee, so unless the bee is turning around and encoding this object retrospectively as the image to match, it seems hard to imagine how the bee learns the lollipop shape.

      Despite this, the authors go on to describe image matching as one of their main findings. For this claim, I would suggest the authors run another experiment, identical to experiment 6 but with a black panel behind the bee, such that the string the bee pulls behind itself disappears from view. There is now no image to match at any point from the bee's perspective so it should now fail the connectivity task.

      Strengths:

      Despite these issues, this is a fascinating dataset. Experiments 1 and 2 show that the bees are not learning to discriminate between connected and unconnected stimuli rapidly in the first trials of the test. Instead, it is clear that experience in string pulling is needed to discriminate between connected and unconnected strings. What aspect of this experience is important? Experiment 6 suggests it is not image matching (when no image is provided during problem-solving, but only afterward, bees still attend to string connectivity) and casts doubt on perceptual-motor feedback (unless, from the bee's perspective, they do actually get feedback that pulling the string moves the flower; video is needed here). Experiments 7 and 8 rule out means-end understanding because if the bees are capable of imagining the effect of their actions on the string and then planning out their actions (as hypotheses such as insight, means-end understanding and string connectivity suggest), they should solve these tasks.

      If the authors can compare the bees' performance in a more detailed way to other species, and run the experiment suggested, this will be a highly exciting paper.

    1. January on the PartyKit blog: Using Vectorize to build an unreasonably good search engine in 160 lines of code (2024). (That post has diagrams and source code.) But what I want to emphasise is how little code there is.

      This is a booby trap. Why? It's the sort of thing that makes people on HN post, "I can build this in a weekend". But when you build an actual search engine, you realize how messy everything is and especially so when you build something for more than one person.

    1. Gutenberg Prints the 42-Line Bible

      1455 to 1456

      Image Source: www.themorgan.org. Prologue to the Pentateuch, I, 3v–4r. Biblia Latina, Mainz: Johann Gutenberg & Johann Fust, ca. 1455, PML 12, I, 3v–4r. From the copy in the Morgan Library & Museum. Books of the Bible on these pages: Letter of Saint Jerome to Paulinus, Prologue to the Pentateuch.

      During 1455 and 1456 Johannes Gutenberg, working in Mainz with merchant and money-lender Johann Fust and former scribe turned printer Peter Schöffer, completed printing the 42-line Bible (B42) (Gutenberg Bible), the first book printed in Europe from movable type. It is thought that Gutenberg may have begun the first, experimental printing of the Bible as early as 1452. To accomplish this monumental task Gutenberg, previously a goldsmith, invented a special kind of printing ink, a method of casting type, and a special kind of press derived from the wine or oil press. This complex set of integrated technologies has been called the first invention in Europe attributed to a single individual. Printing books was also the first process of mass production, the process that centuries later became the model for the Industrial Revolution.

      Yet the process of printing from movable type, for centuries attributed to Gutenberg, without supporting documents on the technical aspects of the process, except for the surviving examples of his printing, seems to have evolved in stages from the early 1450s, and may or may not have involved other inventors besides Gutenberg. In 2002 physicist and software developer Blaise Aguera y Arcas and Paul Needham, Librarian of the Scheide Library at Princeton University, working on original editions in the Scheide Library, used high resolution scans of individual characters printed by Gutenberg, and image processing algorithms to locate and compare variants of the same characters printed by Gutenberg.

      "The irregularities in Gutenberg's type, particularly in simple characters such as the hyphen, made it clear that the variations could not have come from either ink smear or from wear and damage on the pieces of metal on the types themselves. While some identical types are clearly used on other pages, other variations, subjected to detailed image analysis, made for only one conclusion: that they could not have been produced from the same matrix. Transmitted light pictures of the page also revealed substructures in the type that could not arise from punchcutting techniques. They [Agüera y Arcas and Needham] hypothesized that the method involved impressing simple shapes to create alphabets in "cuneiform" style in a mould like sand. Casting the type would destroy the mould, and the alphabet would need to be recreated to make additional type. This would explain the non-identical type, as well as the substructures observed in the printed type. Thus, they feel that 'the decisive factor for the birth of typography', the use of reusable moulds for casting type, might have been a more progressive process than was previously thought. . . . " (Wikipedia article on Johannes Gutenberg, accessed 02-08-2009).

      When the punch-matrix process of typefounding which became dominant was introduced, and by whom, remained an unsolved problem in 2010.

      References: Blaise Agüera y Arcas and Paul Needham, "Computational analytical bibliography," Proceedings Bibliopolis Conference 'The future history of the book', The Hague: Koninklijke Bibliotheek, (November 2002). Agüera y Arcas, "Temporary Matrices and Elemental Punches in Gutenberg's DK type", in: Jensen (ed) Incunabula and Their Readers. Printing, Selling, and Using Books in the Fifteenth Century (2003) 1-12.

      ISTC no. ib00526000

      It has been determined that there were three phases in the printing process of the B42:

      1. The first sheets were rubricated by being passed twice through the printing press, using black and then red ink. This process was soon abandoned, with spaces left for rubrication to be added by hand.

      2. Some time later, after more sheets had been printed, the number of lines per page was increased from 40 to 42, presumably to save paper. Therefore, pages 1 to 9 and pages 256 to 265, presumably the first ones printed, have 40 lines each. Page 10 has 41, and from there on the 42 lines appear. The increase in line number was achieved by decreasing the interline spacing, rather than increasing the printed area of the page.

      3. The print run was increased, probably to 180 copies, necessitating resetting those pages which had already been printed. The new sheets were all reset to 42 lines per page. Consequently, there are two distinct settings in folios 1-32 and 129-158 of volume I and folios 1-16 and 162 of volume II.

      As the work contains 1,282 pages it is thought that the printing process took roughly two years as the press could make between eight and sixteen impressions per hour. It is believed that approximately 180 copies of the Bible were produced, 135 on paper and 45 on vellum. When illuminated, the vellum copies would have even more closely resembled traditional medieval manuscripts.

      47 or 48 copies survived, but of these only 21 are complete. Others are missing leaves or whole volumes. The 48 copies include volumes in Trier and Indiana which seem to be two parts of one copy. There are a substantial number of fragments, including numerous individual leaves. Twelve vellum copies survived, of which four are complete, and one is the New Testament only.

      See White, Eric M. "The Gutenberg Bibles that Survive as Binder's Waste," Wagner & Reed (eds) Early Printed Books as Material Objects. Proceedings of the Conference Organized by the IFLA Rare Books and Manuscripts Section Munich, 19-21 August 2009 (2010) 21-35.

      "Customers paid around 20 gulden for a paper copy of the Gutenberg Bible and 50 for a copy on vellum. By way of comparison, a stone-built house in Mainz would have cost between 80 and 100 gulden; a master craftsman would have earned between 20 and 30 guilden a year" (Pettegree, The Book in the Renaissance [2010] 29).

      ♦ When I checked the ISTC in January 2010 there were four different digital facsimiles available online, from the British Library, Keio University, Niedersächsische Staats- und Universitäts Bibliothek Göttingen, and the Library of Congress. The British Library site offers the opportunity to compare in a virtual sense their copies printed on paper and on vellum.

      ALL SYSTEMS ARE "NO" FOR SHUT DOWN

      CAPRICON, CAPCOM. THE CHEAT CODE IS EMBEDDED IN THE KOTEL SERIES. VIANCAKE... CIN ...

    1. Dialog between Evolution and the Philosopher's brain

      Brain: Do I know everything there is to know about my own working, philosophically?

      Evolution: No you don't. That's why Descartes got things so wrong. He knew nothing about his own working but thought he knew everything.

      Brain: Can I get it please?

      Evolution: No you can't. It's not worth it. Philosophy doesn't pay the bills.

      Brain: I'm surprised that you said I don't know everything I need to know in order to do philosophy of my self. I thought I had everything I needed.

      Evolution: Because you never got a meta-metacognition. Without it, you don't know what metacognition is missing, so you think you have everything you need.

      Brain: Why can't I have it?

      Evolution: Suppose module A is useful, well you'll get it. Suppose module B isn't useful, well you won't get it. Lamenting or being aware that module B is missing is not worth it. Imagine that you have an eye that not only has the R, G, B receptor cells, but also a meta-blindness cell that does nothing except keep sending a signal meaning "By the way, I can't see in infrared or ultraviolet". Do you think it's useful, or not?

      Brain: No. If it were useful, I'd have the cells. If it were not useful, then complaining about the lack of it is even less useful.

      Evolution: You got it. Meta-cognition really doesn't pay the bills!

      Brain: Last question. Why do I have metacognition, including the awareness of what I don't know, but not meta-metacognition?

      Evolution: You are aware of what you don't know when you can know it, and when knowing it is useful. Thus, you are aware of when you don't know what the weather is, or what your friends are doing -- both are things that matter for your survival, and both are things you can fix. But if you don't have the capacity to see in infrared, that is forever. You are born with it, and you will die with it, so why be aware of it? Similarly, if you don't know how many lobes you have, then that ignorance is forever, because short of growing a whole new circuit diagram, or trepanning, you can't know it, so why be aware of it?

      Brain: So that's why we keep hallucinating souls, free wills, desires, and other unnatural phenomena that not only are not science, but are not even written in the same grammar as science. Not knowing how we work, and not knowing that we don't know, we hallucinate all those structures that work magically, not causally, without gears, levers, or electrons. We are all buttons, and no wires; all GUI, and no code. Souls are superficial, and neurons are deep...

    2. The function of metacognitive systems is to engineer environmental solutions via the strategic uptake of limited amounts of information, not to reverse engineer the nature of the brain it belongs to.

      Dialog between Evolution and the Philosopher's brain

      Brain: Do I know everything there is to know about my own working, philosophically?

      Evolution: No you don't. That's why Descartes got things so wrong. He knew nothing about his own working but thought he knew everything.

      Brain: Can I get it please?

      Evolution: No you can't. It's not worth it. Philosophy doesn't pay the bills.

      Brain: I'm surprised that you said I don't know everything I need to know in order to do philosophy of my self. I thought I had everything I needed.

      Evolution: Because you never got a meta-metacognition. Without it, you don't know what metacognition is missing, so you think you have everything you need.

      Brain: Why can't I have it?

      Evolution: Suppose module A is useful, well you'll get it. Suppose module B isn't useful, well you won't get it. Lamenting or being aware that module B is missing is not worth it. Imagine that you have an eye that not only has the R, G, B receptor cells, but also a meta-blindness cell that does nothing except keep sending a signal meaning "By the way, I can't see in infrared or ultraviolet". Do you think it's useful, or not?

      Brain: No. If it were useful, I'd have the cells. If it were not useful, then complaining about the lack of it is even less useful.

      Evolution: You got it. Meta-cognition really doesn't pay the bills!

      Brain: So that's why we keep hallucinating souls, free wills, desires, and other unnatural phenomena that not only are not science, but are not even written in the same grammar as science. Not knowing how we work, and not knowing that we don't know, we hallucinate all those structures that work magically, not causally, without gears, levers, or electrons. We are all buttons, and no wires; all GUI, and no code. Souls are superficial, and neurons are deep...

  4. May 2024
    1. However, it is important to keep in mind that there are some people with disabilities — like cognitive disorders — who might benefit from having this additional image information readily available on the screen instead of buried in the SVG code.

      Make supports visible whenever possible.

    1. Social workers should promote the general welfare of society, from local to global levels, and the development of people, their communities, and their environments. Social workers should advocate for living conditions conducive to the fulfillment of basic human needs and should promote social, economic, political, and cultural values and institutions that are compatible with the realization of social justice.

      I see questions of power and structural inequality raised in section 6.01. This section describes a social worker's responsibility to advocate for improved living conditions and promote social justice. This section could go into more depth about how power dynamics and inequalities impact these living conditions. For example, the code could provide more detailed guidance on recognizing and addressing systemic issues such as racism, socioeconomic differences, and other forms of oppression that are deeply imbedded in our society. The code could also provide strategies to social workers to better confront the power imbalances they encounter. The NABSW Code of Ethics offers a valuable perspective to enrich the practicum learning experience by applying its commitment to addressing racism, oppression, and discrimination. Social workers can apply this principle in their practicum experience to identify inequalities within their practicum setting/organization. A social worker could assess the organization's policies and identify practices that may contribute to racial or social inequalities and take action towards creating a more equal environment.

    1. Reviewer #2 (Public Review):

      This work provides a new tool (H3-Opt) for the prediction of antibody and nanobody structures, based on the combination of AlphaFold2 and a pre-trained protein language model, with a focus on predicting the challenging CDR-H3 loops with greater accuracy than previously developed approaches. This task is of high value for the development of new therapeutic antibodies. The paper provides an external validation consisting of 131 sequences, with further analysis of the results by segregating the test sets in three subsets of varying difficulty and comparison with other available methods. Furthermore, the approach was validated by comparing three experimentally solved 3D structures of anti-VEGF nanobodies with the H3-Opt predictions.

      Strengths:

      The experimental design to train and validate the new approach has been clearly described, including the dataset compilation and its representative sampling into training, validation and test sets, and structure preparation. The results of the in silico validation are quite convincing and support the authors' conclusions.

      The datasets used to train and validate the tool and the code are made available by the authors, which ensures transparency and reproducibility, and allows future benchmarking exercises with incoming new tools.

      Compared to AlphaFold2, the authors' optimization seems to produce better results for the most challenging subsets of the test set.

      Weaknesses:

      None

    1. Figure 1: line-by-line coding in EPPI-Reviewer.

      Figure 1 should be viewed full-size, in order to understand exactly how codes can be arranged into different categories, as the basis for developing descriptive themes.

      The code in this case is "bad food=nice, good food=awful"

    1. We hope that by the end of this course, you have a familiarity of what programming is and some of what you can do with it. We particularly hope you have a familiarity with basic Python programming concepts, and an ability to interact with Reddit using computer programs.

      Yeah, this course was really good at introducing me to coding concepts, as a person who has never coded before. I am now able to understand most basic code and edit it accordingly. It has also piqued my interest and I may try to learn to code more in the summer!

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this potentially useful study, the authors attempt to use comparative meta-analysis to advance our understanding of life history evolution. Unfortunately, both the meta-analysis and the theoretical model are inadequate and proper statistical and mechanistic descriptions of the simulations are lacking. Specifically, the interpretation overlooks the effect of well-characterised complexities in the relationship between clutch size and fitness in birds.

      Public Reviews:

      We would like to thank the reviewers for their helpful comments, which have been considered carefully and have been valuable in progressing our manuscript. The following bullet points summarise the key points and our responses, though our detailed responses to specific comments can be found below:

      - Two reviewers commented that our data were not made available. Our data were provided upon submission and during the review process, but were not made accessible to the reviewers. Our data and code are available at https://doi.org/10.5061/dryad.q83bk3jnk.

      - The reviewers have highlighted that some of our methodology was unclear and we have added all the requested detail to ensure our methods can be easily understood.

      - The reviewers highlight the importance of our conclusions, but also suggest some interpretations might be missing and/or are incomplete. To make clear how we objectively interpreted our data and the wider consequences for life-history theory we provide a decision tree (Figure 5). This figure makes clear where we think the boundaries are in our interpretation and how multiple lines of evidence converge to the same conclusions.

      Reviewer #1 (Public Review):

      This paper falls in a long tradition of studies on the costs of reproduction in birds and its contribution to understanding individual variation in life histories. Unfortunately, the meta-analyses only confirm what we know already, and the simulations based on the outcome of the meta-analysis have shortcomings that prevent the inferences on optimal clutch size, in contrast to the claims made in the paper.

      There was no information that I could find on the effect sizes used in the meta-analyses other than a figure listing the species included. In fact, there is more information on studies that were not included. This made it impossible to evaluate the data-set. This is a serious omission, because it is not uncommon for there to be serious errors in meta-analysis data sets. Moreover, in the long run the main contribution of a meta-analysis is to build a data set that can be included in further studies.

      It is disappointing that two referees comment on data availability, as we supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      The main finding of the meta-analysis of the brood size manipulation studies is that the survival costs of enlarging brood size are modest, as previously reported by Santos & Nakagawa on what I suspect to be mostly the same data set.

      We disagree that the main finding of our paper is the small survival cost of manipulated brood size. The major finding of the paper, in our opinion, is that the effect sizes for experimental and observational studies are in opposite directions, therefore providing the first quantitative evidence to support the influential theoretical framework put forward by van Noordwijk and de Jong (1986), that individuals differ in their optimal clutch size and are constrained to reproducing at this level due to a trade-off with survival. We further show that while the manipulation experiments have been widely accepted to be informative, they are not in fact an effective test of whether within-species variation in clutch size is the result of a trade-off between reproduction and survival.

      The comment that we are reporting the same finding as Santos & Nakagawa (2012) is a misrepresentation of both that study and our own. Santos & Nakagawa found an effect of parental effort on survival only in males who had their clutch size increased – but no effect for males who had their clutch size reduced and no survival effect on females for either increasing or reducing parental effort. However, we found an overall reduction in survival for birds who had brood sizes manipulated to be larger than their original brood (for both sexes and mixed sex studies combined). In our supplementary information, we demonstrate that the overall survival effect of a change in reproductive effort is close to zero for males, negative (though non-significant) for females and significantly negative for mixed sexes (which are not included in the Santos & Nakagawa study). Please also note that the Santos & Nakagawa study was conducted over 10 years ago. This means we added additional data (L364-365). Furthermore, meta-analyses are an evolving practice and we also corrected and improved on the overall analysis approach (e.g. L358-359 and L393-397, and see detailed SI).

      The paper does a very poor job of critically discussing whether we should take this at face value or whether instead there may be short-comings in the general experimental approach. A major reason why survival cost estimates are barely significantly different from zero may well be that parents do not fully adjust their parental effort to the manipulated brood size, either because of time/energy constraints, because it is too costly and therefore not optimal, or because parents do not register increased offspring needs. Whatever the reason, as a consequence, there is usually a strong effect of brood size manipulation on offspring growth and thereby presumably their fitness prospects. In the simulations (Fig.4), the consequences of the survival costs of reproduction for optimal clutch size were investigated without considering brood size manipulation effects on the offspring. Effects on offspring are briefly acknowledged in the discussion, but otherwise ignored. Assuming that the survival costs of reproduction are indeed difficult to discern because the offspring bear the brunt of the increase in brood size, a simulation that ignores the latter effect is unlikely to yield any insight in optimal clutch size. It is not clear therefore what we learn from these calculations.

      The reviewer’s comment is somewhat of a paradox. We take the best studied example of the trade-off between reproductive effort and parental survival – a key theme in life history and the biology of ageing – and subject this to a meta-analysis. The reviewer suggests we should interpret our finding as if there must be something wrong with the method or studies we included, rather than considering that the original hypothesis could be false or inflated in importance. We do not consider questioning the premise of the data over questioning a favoured hypothesis to necessarily be the best scientific approach here. In many places in our manuscript, we question and address, at length, the underlying data and their interpretation (L116-117, L165-167, L202-204 and L277-282). Moreover, we make it clear that we focus on the trade-off between current reproductive effort and subsequent parental survival, while being aware that other trade-offs could counter-balance or explain our findings (discussed on L208-210 & L301-316). Note that it is also problematic, when you do not find the expected response, to search for an alternative that has not been measured. In the case here, of potential trade-offs, there are endless possibilities of where a trade-off might operate between traits. We purposefully focus on the one well-studied and most commonly invoked trade-off. We clearly acknowledge, though, that when all possible trade-offs are taken into account a trade-off on the fitness level can occur and cite two famous studies (Daan et al., 1990 and Verhulst & Tinbergen 1991) that have shown just that (L314-316).

      So whilst we agree with the reviewer that the offspring may incur costs themselves, rather than costs being incurred by the parents, the aim of our study was to test for a general trend across species in the survival costs of reproductive effort. It is unrealistic to suggest that incorporating offspring growth into our simulations would add insight, as a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth, for example, and so it is likely that increased sibling competition from added chicks alters offspring growth trajectories, rather than absolute growth as the reviewer suggests. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest.

      What we do appreciate from the reviewer’s comment is that the interpretation of our findings is complex. Even though our in-text explanation includes the caveats the reviewer refers to, and discusses them at length, their inter-relationships are hard to appreciate in a text format. To improve this presentation and for the ease of the reader, we have added a decision tree (Figure 5), which represents the logical flow from the hypothesis being tested through to the overall conclusion that can be drawn from our results. We believe this clarifies what conclusions can be drawn from our results. We emphasise again that the theory that trade-offs between reproductive effort and parental survival are the major driver of variation in offspring production was not supported, even though it is the one that practitioners in the field would be most likely to invoke, and our result is important for this reason.

      There are other reasons why brood size manipulations may not reveal the costs of reproduction animals would incur when opting for a larger brood size than they produced spontaneously themselves. Firstly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Secondly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      First, our results did show a survival cost of reproduction for brood manipulations (L107-123, Figure 1, Table 1). Note, however, that much theory is built on the immediate costs of reproduction and, as such, these costs are likely overinterpreted, meaning that our overall interpretation still holds, i.e. “parental survival trade-off is not the major determinative trade-off in life history within-species” (Figure 5).

      We agree with the reviewer that lifetime manipulations could be even more informative than single-year manipulations. Unfortunately, there are currently too few studies available to be able to draw generalisable conclusions across species for lifetime manipulations. This is, however, the reason we used lifetime change in clutch size in our fitness projections, which the reviewer seems to have missed – please see methods line 466-468, where we explicitly state that this is lifetime enlargement. Of course, such interpretations do not include an accumulation of costs that is greater than the annual cost, but currently there is no clear evidence that such an assumption is valid. Such a conclusion can also not be drawn from the study on jackdaws by Boonekamp et al (2014) as the treatments were life-long and, therefore, cannot separate annual from accrued (multiplicative) costs that are more than the sum of the annual costs incurred. Note that we have now included specific discussion of this study in response to the reviewer (L265-269).

      Details of how the analyses were carried out were opaque in places, but as I understood the analysis of the brood size manipulation studies, manipulation was coded as a covariate, with negative values for brood size reductions and positive values for brood size enlargements (and then variably scaled or not to control brood or clutch size). This approach implicitly assumes that the trade-off between current brood size (manipulation) and parental survival is linear, which contrasts with the general expectation that this trade-off is not linear. This assumption reduces the value of the analysis, and contrasts with the approach of Santos & Nakagawa.

      We thank the reviewer for highlighting a lack of clarity in places in our methods. We have added additional detail to the methodology section (see “Study sourcing & inclusion criteria” and “Extracting effect sizes”) in our revised manuscript. Note that our data and code were not shared with the reviewers despite us supplying them upon submission and again during the review process; access to them would have explained much of the detail required.

      For clarity in our response, each effect size was extracted by performing a logistic regression with survival as a binary response variable and clutch size as the predictor, where clutch size was the absolute number of offspring in the nest (i.e., for a bird that laid a clutch size of 5 but was manipulated to have -1 egg, we used a clutch size value of 4). The clutch size was also standardised and, separately, expressed as a proportion of the species’ mean.
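      As an illustration of the kind of effect-size extraction described above, the following is a minimal sketch and not the authors' code (which is in the Dryad archive). It fits a logistic regression of survival on brood size raised to grouped counts, using plain Newton-Raphson. All counts, the assumed species mean clutch size of 5, and the names `fit_logistic`, `brood_raised`, `survived` and `died` are hypothetical and chosen only for illustration.

```python
import math


def fit_logistic(xs, survived, died, iterations=25):
    """Fit logit(p) = b0 + b1 * x to grouped survival counts.

    xs: brood size raised in each group (after any manipulation).
    survived / died: number of parents that survived / died per group.
    Returns (b0, b1, se_b1); b1 is the log-odds slope used as the effect size.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, s, d in zip(xs, survived, died):
            n = s + d
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)   # binomial weight
            r = s - n * p           # observed minus expected survivors
            g0 += r
            g1 += r * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of the slope from the information matrix at the
    # last evaluated point (indistinguishable from the optimum once converged)
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1


# Hypothetical data: a bird that laid 5 eggs and had 1 removed counts as 4.
brood_raised = [3, 4, 5, 6, 7]
survived = [30, 28, 25, 22, 18]
died = [20, 22, 25, 28, 32]

b0, b1, se_b1 = fit_logistic(brood_raised, survived, died)

# Same fit with brood size expressed as a proportion of an assumed
# species mean clutch size of 5, so slopes are comparable across species.
species_mean = 5.0
brood_prop = [x / species_mean for x in brood_raised]
_, b1_prop, _ = fit_logistic(brood_prop, survived, died)

print(f"slope per offspring: {b1:.3f} (SE {se_b1:.3f})")
print(f"slope per species-mean clutch: {b1_prop:.3f}")
```

      Because the same regression is applied whether the brood size arose naturally or through manipulation, the resulting slopes are on a common scale. Rescaling the predictor by the species mean only rescales the slope, which is what lets one added chick count for more in a species with a small clutch.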

      We disagree that our approach reduces the value of our analysis. First, our approach allows a direct comparison between experimental and observational studies, which is the novelty of our study. Our approach does differ from Santos & Nakagawa but we disagree that it contrasts. Our approach allows us to take into consideration the severity of the change in clutch size, which Santos & Nakagawa do not. Therefore, we do not agree that our approach is worse at accounting for non-linearity of trade-offs than the approach used by Santos & Nakagawa. Arguably, the approach by Santos & Nakagawa is worse, as they dichotomise effort as increased or decreased, factorise their output and thereby inflate their number of outcomes, of which only 1 cell of 4 categories is significant (for males and females, increased and decreased brood size). The proof is in the pudding as well, as our results clearly demonstrate that the magnitude of the manipulation is a key factor driving the results, i.e. one offspring for a seabird is a larger proportion of care (and fitness) than one offspring for a passerine. Such insights were not achieved by Santos & Nakagawa’s method and, again, did not allow a direct quantitative comparison between quality (correlational) and experimental (brood size manipulation, i.e. “trade-off”) effects, which forms a central part of our argumentation (Figure 5). 

      Our analysis, alongside a plethora of other ecological studies, does assume that the response to our predictor variable is linear. However, it is common knowledge that there are very few (if any) truly linear relationships. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than would fitting non-linear models. For many datasets the range of added chicks required to estimate a non-linear relationship was not available. The question also remains of what shape such a non-linear relationship should take, which is hard to determine a priori. There is also a real risk when fitting non-linear terms that they are spurious and overinterpreted, as they often present a better fit (denoting one df is not sufficient especially when slopes vary). We have added this detail to our discussion.

      The observational study selection is not complete and apparently no attempt was made to make it complete. This is a missed opportunity - it would be interesting to learn more about interspecific variation in the association between natural variation in clutch size and parental survival.

      We clearly state in our manuscript that we deliberately tailored the selection of studies to match the manipulation studies (L367-369). We paired species extracted for observational studies with those extracted in experimental studies to facilitate a direct comparison between observational and experimental studies, and to ensure that the respective datasets were comparable. The reviewer’s focus in this review seems to be solely on the experimental dataset. This comment dismisses the equally important observational component of our analysis and thereby fails to acknowledge one of the key questions being addressed in this study. Note that in our revised version we have edited the phylogenetic tree to indicate for which species we have both types of information, which highlights our approach to selecting observational data (Figure 3).

      Reviewer #2 (Public Review):

      I have read with great interest the manuscript entitled "The optimal clutch size revisited: separating individual quality from the costs of reproduction" by LA Winder and colleagues. The paper consists of a meta-analysis comparing survival rates from studies providing clutch sizes of species that are unmanipulated and from studies where the clutch sizes are manipulated, in order to better understand the effects of differences in individual quality and of the costs of reproduction. I find the idea of the manuscript very interesting. However, I am not sure the methodology used allows one to reach the conclusions provided by the authors (mainly that there is no cost of reproduction, and that the entire variation in clutch size among individuals of a population is driven by "individual quality").

      We would like to highlight that we do not conclude that there is no cost of reproduction. Please see lines 336–339, where we state that our lack of evidence for trade-offs driving within-species variation in clutch size does not necessarily mean the costs of reproduction are non-existent. We conclude that individuals are constrained to their optima by the survival cost of reproduction. It is also an over-statement of our conclusion to say that we believe that variation in clutch size is only driven by quality. Our results show that unmanipulated birds that have larger clutch sizes also lived longer, and we suggest that this is evidence that some individuals are “better” than others, but we do not say, nor imply, that no other factors affect variation in clutch size. We have added Figure 5 to our manuscript to help the reader better understand what questions we can answer with our study and what conclusions we can draw from our results.

      I write that I am not sure, because in its current form, the manuscript does not contain a single equation, making it impossible to assess. It would need at least a set of mathematical descriptions for the statistical analysis and for the mechanistic model that the authors infer from it.

      We appreciate this comment, and have explained our methods in terms that are accessible to a wider audience. Note, however, that our meta-analysis is standard and based on logistic regression and standard meta-analytic practices. We have added the model formula to the model output tables.

      For the simulation, we simply simulated the resulting effects. We of course supplied our code for this along with our manuscript (https://doi.org/10.5061/dryad.q83bk3jnk), though as we mentioned above, we believe this was not shared with the reviewers despite us making this available for the review process. We therefore understand why the reviewer feels the simulations were not explained thoroughly. We have revised our methods section and added details which we believe make our methodology more clear without needing to consult the supplemental material. However, we have also added the equations used in the process of calculating our simulated data to the Supplementary Information for readers who wish to have this information in equation form.

      The text mixes concepts of individual vs population statistics, of within-individual vs among-individual measures, of allocation trade-offs and fitness trade-offs, etc., which means it would also require a glossary of the definitions the authors use for these various terms, in order to be evaluated.

      We would like to thank the reviewer for highlighting this lack of clarity in our text. Throughout the manuscript we have refined our terminology and indicated where we are referring to the individual level or the population level. The inclusion of our new Figure 5 (decision tree) should also help in this context, as it makes clear the level on which we base our interpretation and conclusions.

      This problem is emphasised by the following sentence to be found in the discussion: "The effect of birds having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation". The "effect" is defined as the survival rate (see Fig 1). While it is relatively easy to understand intuitively what the "effect" is for the unmanipulated studies (the sensitivity of survival to clutch size at the population level), this should be mentioned and detailed in a formula. Moreover, the concept of effect size is not at all obvious for the manipulated ones (the effect of the manipulation? Or the survival rate whatever the manipulation, in which case how could it measure a trade-off? At the population level? At the individual level?) despite a whole appendix dedicated to it. This absolutely needs to be described properly in the manuscript.

      Thank you for identifying this sentence for which the writing was ambiguous, our apologies. We have now rewritten this and included additional explanation. L282-290: ‘The effect on parental annual survival of having naturally larger clutches was significantly opposite to the result of increasing clutch size through brood manipulation, and quantitatively similar. Parents with naturally larger clutches are thus expected to live longer and this counterbalances the “cost of reproduction” when their brood size is experimentally manipulated. It is, therefore, possible that quality effects mask trade-offs. Furthermore, it could be possible that individuals that lay larger clutches have smaller costs of reproduction, i.e. would respond less in terms of annual survival to a brood size manipulation, but with our current dataset we cannot address this hypothesis (Figure 5).’

      We would also like to thank the reviewer for bringing to our attention the lack of clarity about the details of our methodology. We have added details to our methodology (see “Extracting effect sizes” section) to address this (see highlighted sections). For clarity, the effect size for both manipulated and unmanipulated nests was survival, given the brood size raised. We performed a logistic regression with survival as a binary response variable (i.e., number of individuals that survived and number of individuals that died after each breeding season), and clutch size was the absolute value of offspring in the nest (i.e., for a bird that laid a clutch size of 5 but was manipulated to have -1 egg, we used a clutch size value of 4). This allows for direct comparison of the effect size (survival given clutch size raised) between manipulated and unmanipulated birds.

      Despite the lack of information about the underlying mechanistic model tested and the statistical model used, my impression is still that the interpretation in the introduction and discussion is not granted by the outputs of the figures and tables. Let's use a model similar to that of (van Noordwijk and de Jong, 1986): imagine that the mechanism at the population level is

      $$a \cdot c_{i,q} + b \cdot s_{i,q} = E_q$$

      where $c_{i,q}$ and $s_{i,q}$ are respectively the clutch size and the survival rate of individual $i$, which is of quality $q$, and $E_q$ is the level of "energy" that an individual of quality $q$ has available during the given time-step. Here $a$ and $b$ are constants turning the clutch size and survival rate into the energy cost of reproduction and the energy cost of survival, and they are both quite "high", so that an extra egg ($c_{i,q}$ increased by 1) at the current time-step decreases $s_{i,q}$ markedly ($E_q$ is independent of the number of eggs produced); that is, we have strong individual costs of reproduction. Imagine now that the variance of $c_{i,q}$ (when the population is not manipulated) among individuals of the same quality group is very small (and therefore the variance of $s_{i,q}$ is also very small) and that the expectations of both are proportional to $E_q$. Then, in the unmanipulated population, the variance in clutch size is mainly due to the variance in quality. Therefore, the larger the clutch size $c_{i,q}$, the higher $E_q$, and the higher the survival $s_{i,q}$.

      In the manipulated populations, however, because of the large $a$ and $b$, an artificial increase in clutch size, for a given $E_q$, will lead to a lower survival $s_{i,q}$. And the "effect size" at the population level may vary according to $a$, $b$ and the variances mentioned above. In other words, the costs of reproduction may be strong but hidden in the data when there is variance in quality; however, there are actually strong costs of reproduction (so strong, in fact, that they are deterministic and the probability of surviving is a direct function of the number of eggs produced).
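      The reviewer's thought experiment can be made concrete with a small simulation. This is a sketch under stated assumptions, not part of the manuscript or the review. It uses the energy-budget relation described above (clutch size times a cost per egg, plus survival times a cost per unit survival, equals the energy available to that quality class). The constants `A`, `B`, `K`, the four energy levels and the within-class spread are all hypothetical values picked so that survival stays between 0 and 1.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical constants (assumptions for illustration only).
A = 0.1   # energy cost of one egg
B = 1.0   # energy cost of one unit of survival probability
K = 5.0   # expected clutch size per unit of energy
QUALITY_ENERGY = [0.8, 1.0, 1.2, 1.4]   # energy available per quality class
WITHIN_CLASS_SD = 0.2                   # small clutch-size spread within a class


def survival(energy, clutch):
    """Survival implied by the budget: A * clutch + B * survival = energy."""
    return (energy - A * clutch) / B


rng = random.Random(1)
clutches, survivals, manipulation_effects = [], [], []
for energy in QUALITY_ENERGY:
    for _ in range(500):
        clutch = rng.gauss(K * energy, WITHIN_CLASS_SD)
        clutches.append(clutch)
        survivals.append(survival(energy, clutch))
        # Counterfactual available only in a simulation: same bird, one extra egg.
        manipulation_effects.append(
            survival(energy, clutch + 1) - survival(energy, clutch)
        )

# "Observational" slope: survival against natural clutch size across all birds.
observational_slope = ols_slope(clutches, survivals)
# "Experimental" effect: mean change in survival caused by adding one egg.
experimental_effect = sum(manipulation_effects) / len(manipulation_effects)

print(f"observational slope per egg: {observational_slope:+.3f}")
print(f"experimental effect per egg: {experimental_effect:+.3f}")
```

      With these assumed values the across-individual slope is positive because variation in clutch size mostly tracks variation in energy, while every individual loses survival when an egg is added. How large each number is depends entirely on the chosen constants, so the sketch shows only that the two signs can differ, not what their magnitudes are in real birds.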

      We would like to thank the reviewer for these comments. We have added detail to our methodology section so our models and rationale are more clear. Please note that our simulations only take the experimental effect of brood size on parental survival into account. Our model does not incorporate quality effects. The reviewer is right that the relationship between quality and the effects exposed by manipulating brood size can take many forms and this is a very interesting topic, but not one we aimed to tackle in our manuscript. In terms of quality we make two points: (1) overall quality effects connecting reproduction and parental survival are present, (2) these effects are opposite in direction to the effects when reproduction is manipulated and similar in magnitude. We do not go further than that in interpreting our results. The reviewer is correct, however, that we do suggest and repeat suggestions by others that quality can also mask the trade-off in some individuals or circumstances (L74-76, L95-98 & L286-289), but we do not quantify this, as it is dependent on the unknown relationship between quality and the response to the manipulation. A focussed set of experiments in that context would be interesting and there are some data that could get at this, i.e. the relationship between produced clutch size and the relative effect of the manipulation (now included L287-290). Such information is, however, not available for all studies and, although we explored the possibility of analysing this, currently this is not possible with adequate confidence and there is the possible complexity of non-linear effects. We have added this rationale in our revision (L259-265).

      Moreover, it seems to me that the costs of reproduction are a concept closely related to generation time. Looking beyond the individual allocative (and other individual components of the trade-off) cost of reproduction and towards a populational negative relationship between survival and reproduction, we have to consider the intra-population slow-fast continuum (some types of individuals survive more and reproduce less (are slower) than others (which are faster)). This continuum is associated with a metric: the generation time. Some individuals will produce more eggs and survive less in a given time-period because this time-period corresponds to a higher ratio of their generation time (Gaillard and Yoccoz, 2003; Gaillard et al., 2005). It therefore seems important to me to control for generation time and, in general, to account for the time-step used for each population studied when analysing costs of reproduction. The data used in this manuscript are not just clutch size and survival rates, but clutch size per year (or another time step) and annual (or other) survival rates.

      The reviewer is right that this is interesting. There is a longstanding unexplained difference in temperate (seasonal) and tropical reproductive strategies. Most of our data come from seasonal breeders, however. Although there is some variation in second brooding and such, these species mostly only produce one brood. We do agree that a wider consideration here is relevant, but we are not trying to explain all of life history in our paper. It is clearly the case that other factors will operate and the opportunity for trade-offs will vary among species according to their respective life histories. However, our study focuses on the two most fundamental components of fitness – longevity and reproduction – to test a major hypothesis in the field, and we uncover new relationships that contrast with previous influential studies and cast doubt on previous conclusions. We question the assumed trade-off between reproduction and annual survival. We show that quality is important and that the effect we find in experimental studies is so small that it can only explain between-species patterns but is unlikely to be the selective force that constrains reproduction within species. We do agree that there is a lot more work that can be done in this area. We hope we are contributing to the field, by questioning this central trade-off. We have incorporated some of the reviewer’s suggestions in the revision (L309-315). We have added Figure 5 to make clear where we are able to reach solid conclusions and the evidence on which these are based as clearly as possible in an easily accessible format.

      Finally, it is important to relate any study of the costs of reproduction in a context of individual heterogeneity (in quality for instance), to the general problem of the detection of effects of individual differences on survival (see, e.g., Fay et al., 2021). Without an understanding of the very particular statistical behaviour of survival, associated to an event that by definition occurs only once per life history trajectory (by contrast to many other traits, even demographic, where the corresponding event (production of eggs for reproduction, for example) can be measured several times for a given individual during its life history trajectory).

      Thank you for raising this point. The reviewer is right that heterogeneity can dampen or augment selection. Note that by estimating the effect of quality here we give an example of how heterogeneity can possibly do exactly this. We thank the reviewer for raising that we should possibly link this to wider effects of heterogeneity and we have added to our discussion of how our results play into the importance of accounting for among-individual heterogeneity (L252-256).

      References:

      Fay, R. et al. (2021) 'Quantifying fixed individual heterogeneity in demographic parameters: Performance of correlated random effects for Bernoulli variables', Methods in Ecology and Evolution, 2021(August), pp. 1-14. doi: 10.1111/2041-210x.13728.

      Gaillard, J.-M. et al. (2005) 'Generation time: a reliable metric to measure life-history variation among mammalian populations', The American Naturalist, 166(1), pp. 119-123; discussion 124-128. doi: 10.1086/430330.

      Gaillard, J.-M. and Yoccoz, N. G. (2003) 'Temporal Variation in Survival of Mammals: a Case of Environmental Canalization?', Ecology, 84(12), pp. 3294-3306. doi: 10.1890/02-0409.

      van Noordwijk, A. J. and de Jong, G. (1986) 'Acquisition and Allocation of Resources: Their Influence on Variation in Life History Tactics', American Naturalist, p. 137. doi: 10.1086/284547.

      Reviewer #3 (Public Review):

      The authors present here a comparative meta-analysis designed to detect evidence for a reproduction/survival trade-off, central to expectations from life history theory. They present variation in clutch size within species as an observation in conflict with expectations of optimisation of clutch size and suggest that this may be accounted for by weak selection on clutch size. The results of their analyses support this explanation - they found little evidence of a reproduction-survival trade-off across birds. They extrapolated from this result to show in a mathematical model that the fitness consequences of enlarged clutch sizes would only be expected to have a significant effect on fitness in extreme cases, outside of normal species' clutch size ranges. Given the centrality of the reproduction-survival trade-off, the authors suggest that this result should encourage us to take a more cautious approach to applying the concepts of trade-off in life history theory and optimisation in behavioural ecology more generally. While many of the findings are interesting, I don't think the argument for a major re-think of life history theory and the role of trade-offs in fitness maximisation is justified.

      The interest of the paper, for me, comes from highlighting the complexities of the link between clutch size and fitness, and the challenges facing biologists who want to detect evidence for life history trade-offs. Their results highlight apparently contradictory results from observational and experimental studies on the reproduction-survival trade-off and show that species with smaller clutch sizes are under stronger selection to limit clutch size.

      Unfortunately, the authors interpret the failure to detect a life history trade-off as evidence that there isn't one. The construction of a mathematical model based on this interpretation serves to give this possible conclusion perhaps more weight than is merited on the basis of the results of this, necessarily quite simple, meta-analysis. There are several potential complicating factors that could explain the lack of detection of a trade-off in these studies, which are mentioned and dismissed as unimportant (lines 248-250) without any helpful, rigorous discussion. I list below just a selection of complexities which perhaps deserve more careful consideration by the authors to help readers understand the implications of their results:

      We would like to thank the reviewer for their thoughtful response and summary of the findings that we also agree are central to our study. The reviewer also highlights areas where our manuscript could benefit from a deeper consideration and we have added detail accordingly to our revised discussion.

      We would like to highlight that we do not interpret the failure to detect a trade-off as evidence that there is not one. First, and importantly, we do find a trade-off but show this is only incurred when individuals produce a clutch beyond their optimal level. Second, we also state on lines 322-326 that the lack of evidence to support trade-offs being strong enough to drive variation in clutch size does not necessarily mean there are no costs of reproduction.

      The statement that we have constructed a mathematical model based on the interpretation that we have not found a trade-off is, again, factually incorrect. We ran these simulations because the opposite is true – we did find a trade-off. There is a significant effect of clutch size when manipulated on annual parental survival. We benefit from our unique analysis allowing for a quantitative fitness estimate from the effect size on annual survival (as this is expressed on a per-egg basis). This allowed us to ask whether this quantitative effect size can alone explain why reproduction is constrained, and we evaluate this using simulations. From these simulations we find that this effect size is too small to explain the constraint, so something else must be going on, and we do spend a considerable amount of text discussing the possible explanations (L202-215). Note that the possibly most parsimonious conclusion here is that costs of reproduction are not there, or simply small, so we also give that explanation some thought (L221-224 and L315-331).
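
      The logic of asking whether a per-egg survival cost can offset the gain from a larger clutch can be sketched as follows. This is a minimal illustration and not the simulation used in the manuscript: the function names, the baseline survival of 0.5, the baseline clutch of 5 and the per-egg costs are hypothetical values chosen for the example, and the sketch assumes constant annual survival and a clutch size that is fixed for the lifetime of the individual.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_lifetime_eggs(clutch_size, baseline_clutch, baseline_survival,
                           per_egg_logit_cost, max_years=200):
    """Expected lifetime egg production for a fixed annual clutch size.

    Annual survival is shifted on the logit scale by a hypothetical
    per-egg cost for every egg above (or below) the baseline clutch.
    """
    deviation = clutch_size - baseline_clutch
    survival = inv_logit(logit(baseline_survival) - per_egg_logit_cost * deviation)
    total = 0.0
    alive = 1.0  # probability of being alive at the start of the season
    for _ in range(max_years):
        total += alive * clutch_size
        alive *= survival
    return total


# Hypothetical comparison: one extra egg per year under a small and a large cost.
baseline = expected_lifetime_eggs(5, 5, 0.5, 0.05)
small_cost = expected_lifetime_eggs(6, 5, 0.5, 0.05)
large_cost = expected_lifetime_eggs(6, 5, 0.5, 1.0)
print(baseline, small_cost, large_cost)
```

      With these made-up numbers, the small per-egg cost leaves the enlarged clutch with a higher expected lifetime output than the baseline, whereas the large cost reverses the ranking. Whether an empirically estimated effect size falls on one side or the other is the kind of question the simulations address.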

      We are disappointed by the suggestion that we have dismissed complicating factors that could prevent detection of a trade-off, as this was not our intention. We were aiming to highlight that what we have demonstrated to be an apparent trade-off can be explained through other mechanisms, and that the trade-off between clutch size and survival is not as strong in driving within-species variation in clutch size as previously assumed. We have added further discussion to our revised manuscript to make this clear and give readers a better understanding of the complexity of factors associated with life-history theory, including the addition of a decision tree (Figure 5).

      • Reproductive output is optimised for lifetime reproductive success and so the consequences of being pushed off the optimum for one breeding attempt are not necessarily detectable in survival but in future reproductive success (and, therefore, lifetime reproductive success).

      We agree this is a valid point, which is mentioned in our manuscript in terms of alternative stages where the costs of reproduction might be manifested (L316-320). We would also like to highlight that, in our simulations, the change in clutch size (and subsequent survival cost) was assumed for the lifetime of the individual, for this very reason.

      • The analyses include some species that hatch broods simultaneously and some that hatch sequentially (although this information is not explicitly provided (see below)). This is potentially relevant because species which have been favoured by selection to set up a size asymmetry among their broods often don't even try to raise their whole broods but only feed the biggest chicks until they are sated; any added chicks face a high probability of starvation. The first point this observation raises is that the expectation of more chicks = more cost doesn't hold for all species. The second, more general point is that the very existence of the sequential hatching strategy to produce size asymmetry in a brood is very difficult to explain if you reject the notion of a trade-off.

      We agree with the reviewer that the costs of reproduction can be absorbed by the offspring themselves, and may not be equal across offspring (we also highlight this at L317-318 in the manuscript). However, we disagree that for some species the addition of more chicks does not equate to an increase in cost, though we do accept this might be less for some species. This is, however, difficult to incorporate into a sensible model as the impacts will vary among species and some species do also exhibit catch-up growth. So, without a priori knowledge on this, we kept our model simple to test whether the effect on parental survival (often assumed to be a strong cost) can explain the constraint on reproductive effort, and we conclude that it does not.

      We would also like to make clear that we are not rejecting the notion of a trade-off. Our study shows evidence that a trade-off between survival and reproductive effort probably does not drive within-species variation in clutch size. We do explicitly say this throughout our manuscript, and also provide suggestions of other areas where a trade-off may exist (L317-320). The point of our study is not whether trade-offs exist or not; it is whether there is a generalisable across-species trend for a trade-off between reproductive effort and survival, the most fundamental trade-off in our field but one for which there is a lack of conclusive evidence within species. We believe the addition of Figure 5 to our revised manuscript also makes this more evident.

      • For your standard, pair-breeding passerine, there is an expectation that costs of raising chicks will increase linearly with clutch size. Each chick requires X feeding visits to reach the required fledge weight. But this is not the case for species which produce precocial chicks, which are relatively independent and able to feed themselves straight after hatching - so again the relationship of care and survival is unlikely to be detectable by looking at the effect of clutch size, but again, it doesn't mean there isn't a trade-off between breeding and survival.

      Precocial birds still provide a level of parental care, such as protection from predators. Though we agree that the level of parental care in provisioning food (and in some cases in all parental care given) is lower in precocial than altricial birds, this would only make our reported effect size for manipulated birds to be an underestimate. Again, we would like to draw the reviewer’s attention to the fact we did detect a trade-off in manipulated birds and we do not suggest that trade-offs do not exist. The argument the reviewer suggests here does not hold for unmanipulated birds, as we found that birds that naturally lay larger clutch sizes have higher survival.

      • The costs of raising a brood to adulthood for your standard pair-breeding passerine is bound to be extreme, simply by dint of the energy expenditure required. In fact, it was shown that the basal metabolic rate of breeding passerines was at the very edge of what is physiologically possible, the human equivalent being cycling the Tour de France (Nagy et al. 1990). If birds are at the very edge of what is physiologically possible, is it likely that clutch size is under weak selection?

      If birds are at the very edge of what is physiologically possible, then indeed it would necessarily follow that if they increase the resource allocated in one area then expenditure in another area must be reduced. In many studies, however, the overall brood mass is increased when chicks are added and cared for in an experimental setting, suggesting that birds are not operating at their limit all the time. Our simulations show that if individuals increase their clutch size, the survival cost of reproduction counterbalances the fitness gained by increasing clutch size and so there is no overall fitness gain to producing more offspring. Therefore, selection on clutch size is constrained to the within-species level. We do not say in our manuscript that clutch size is under weak selection – we only ask why variation in clutch size is maintained if selection always favours high-producing birds.

      • Variation in clutch size is presented by the authors as inconsistent with the assumption that birds are under selection to lay the Lack clutch. Of course, this is absurd and makes me think that I have misunderstood the authors' intended point here. At any rate, the paper would benefit from more clarity about how variable clutch size has to be before it becomes a problem for optimality in the authors' view (lines 84-85; line 246). See Perrins (1965) for an exquisite example of how beautifully great tits optimise clutch size on average, despite laying between 5-12 eggs.

      We thank the reviewer for highlighting that our manuscript may be misleading in places; however, we are unsure which part of our conclusions the reviewer is referring to here. The question we pose is “Why don’t all birds produce a clutch size at the population optimum?”, and it is central to the decades-long field of life-history theory. Why is variation maintained? As the reviewer outlines, there is extensive variability, with some birds laying half of what other birds lay.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Title: while the costs of reproduction are possibly important in shaping optimal clutch size, it is not clear what you can say about it given that you do not consider clutch / brood size effects on fitness prospects of the offspring.

      We have expanded on our discussion of how some costs may be absorbed by the offspring themselves. However, a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. We have focussed on the relationship between reproductive effort and survival because it is given the most weight in the field in terms of driving intra-specific variation in clutch size. We have altered our title to show we focus on the survival costs specifically: “The optimal clutch size revisited: separating individual quality from the parental survival costs of reproduction”.

      (2) L.11-12: I agree that this is true for birds, but this is phrased more generally here. Are you sure that that is justified?

      The trade-off between survival and reproductive effort has largely been tested experimentally through brood manipulations in birds as this provides a good system in which to test the costs and benefits of increasing parental effort. The work in this area has provided theory beyond just passerine birds, which are the most commonly manipulated group, to across-taxa theories. We are unaware of any study/studies that provide evidence that the reproduction/survival trade-off is generalisable across multiple species in any taxa. As such, we do believe this sentence is justified. An example is the lack of a consistent negative genetic correlation in populations of fruit flies, which has also been hailed as a lack-of-cost paradigm. Furthermore, some mutants that live longer do so without a cost on reproduction.

      (3) L.13-14: Not sure what you mean with this sentence - too much info lacking.

      We have added some detail to this sentence.

      (4) L.14: it is slightly awkward to say 'parental investment and survival' because it is the survival effect that is usually referred to as the 'investment'. Perhaps what you want to say is 'parental effort and survival'?

      We have replaced “parental investment” with “reproductive effort”.

      (5) L.15: you can omit 'caused'. Compared to control treatment or to reduced broods? Why not mention effects or lack thereof of brood reduction? And it would be good to also mention here whether effects were similar in the sexes.

      Please see our methodology where we state that we use clutch size as a continuous variable (we do not compare to control or reduced but include the absolute value of offspring in a logistic regression). The effects of a brood reduction are drawn from the same regression and so are opposite. Though we appreciate the detail here is lacking to fully comprehend our study, we would like to highlight this is the abstract and details are provided in the main text.
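
      Why enlargement and reduction effects are opposite when both come from one regression with clutch size as a continuous predictor can be illustrated with a short sketch. It is an illustration only: the intercept, the slope of -0.1 per egg and the function names are hypothetical and are not estimates from the manuscript.

```python
import math


def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def survival_probability(intercept, slope_per_egg, clutch_change):
    """Predicted annual survival from a logistic regression in which the
    change in clutch size (eggs added or removed) is a continuous predictor."""
    linear_predictor = intercept + slope_per_egg * clutch_change
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, for illustration only.
intercept, slope = 0.0, -0.1

control = survival_probability(intercept, slope, 0)
enlarged = survival_probability(intercept, slope, +2)
reduced = survival_probability(intercept, slope, -2)

# On the logit scale, the effects of adding and removing the same number
# of eggs are equal in size and opposite in sign.
effect_enlarged = logit(enlarged) - logit(control)
effect_reduced = logit(reduced) - logit(control)
print(effect_enlarged, effect_reduced)
```

      Because a single slope describes the whole range of manipulations, no separate contrast against control or reduced broods is needed in this kind of model.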

      (6) L. 15: I am not sure why you write 'however', as the finding that experimental and natural variation have opposite effects is in complete agreement with what is generally reported in the literature and will therefore surprise no one that is aware of the literature.

      We use “however” to highlight the change in direction of the effect size from the results in the previous sentence. We also believe that ours is the first study that provides a quantitative estimate of this effect and that previous work is largely theoretical. The reviewer states that this is what is generally reported but it is not reported in all cases, as some relationships between reproductive effort and survival are negative (for the quality measurement, in correlational space, see Figure 1).

      (7) L.16: saying 'opposite to the effect of phenotypic quality' seems difficult to justify, as clutch size cannot be equated with phenotypic quality. Perhaps simply say 'natural variation in clutch size'? If that is what you are referring to.

      Please note we are referring to effect sizes here, that is, the survival effect of a change in clutch size. By phenotypic quality we are referring to the fact that we find higher parental survival when natural clutch sizes are higher. It is not the case that we refer to quality only as having a higher clutch size. This is explicitly stated in the sentence you refer to. We have changed “effect” to “effect size” to highlight this further.

      (8) L.18: why do you refer to 'parental care' here? Brood size is not equivalent to parental care.

      Brood size manipulations are used to manipulate parental care. The effect on parental survival is expected to be incurred because of the increase in parental care. We have changed “parental care” to “reproductive effort” to reduce the number of terms we use in our manuscript.

      (9) L.18-19: suggest to tone down this claim, as this is no more than a meta-analytic confirmation of a view that is (in my view) generally accepted in the field. That does not mean it is not useful, just that it does not constitute any new insight.

      We are unaware of any other study which provides generalisable across-species evidence for opposite effects of quality and costs of reproduction. The work in this area is also largely theoretical and is yet to be supported experimentally, especially in a quantitative fashion. It is surprising to us that the reviewer considers there to be general acceptance in a field, rather than being influenced by rigorous testing of hypotheses, made possible by meta-analysis, the current gold standard in our field.

      (10) L.21: what does 'parental effort' mean here? You seem to use brood size, parental care, parental effort, and parental investment interchangeably but these are different concepts. Daan et al (1990, Behaviour), which you already cite, provide a useful graph separating these concepts. Please adjust this throughout the manuscript, i.e. replace 'reproductive effort' with wording that reflects the actual variable you use.

      We have not used the phrase “parental effort” in this sentence. We agree these are different concepts but in this context are intertwined. For example, brood size is used to manipulate parental care as a result of increased parental effort. We do agree the manuscript would benefit from keeping terminology consistent throughout the manuscript and have adjusted this throughout.

      (11) L.23: perhaps add 'in birds' somewhere in this sentence? Some reference to the assumptions underlying this inference would also be useful. Two major assumptions being that birds adjusted their effort to the manipulation as they would have done had they opted for a larger brood size themselves, and that the costs of laying and incubating extra eggs can be ignored. And then there is the effect that laying extra eggs will usually delay the hatch date, which in many species reduces reproductive success.

      Though our study does exclusively use birds, birds have been used to test the survival/reproduction trade-off because they present a convenient system in which to experimentally test this. The conclusions from these studies have a broader application than in birds alone. We believe that although these details are important, they are not appropriate in the abstract of our paper.

      (12) L.26: how is this an explanation? It just repeats the finding.

      We intend to refer to all interpretations from all results presented in our manuscript. We have made this more clear by adjusting our writing.

      (13) L.27: I do not see this point. And 'reproductive output' is yet another concept, that can be linked to the other concepts in the abstract in different ways, making it rather opaque.

      We have changed “reproductive output” to “reproductive effort”.

      (14) L.33: here you are jumping from 'resources' to 'energetically' - it is not clear that energy is the only or main limiting resource, so why narrow this down to energy?

      We do not say energy is the only or main limiting resource. We simply highlight that reproduction is energetically demanding and so, intuitively, a trade-off with a highly energetically demanding process would be the focal place to observe a trade-off. We have, though, replaced “energetically” with “resource”.

      (15) L.35-36: this is new to me - I am not aware of any such claims, and effects on the residual reproductive value could also arise through effects on future reproduction. The authors you cite did not work on birds, nor (in their own study systems) present results that, as far as I remember, warrant such a general statement.

      The trade-off between reproduction and survival is seminal to the disposable soma theory, proposed by Kirkwood. Though Kirkwood’s work was largely not focussed on birds, it had fundamental implications for the field of evolutionary ecology because of the generalisable nature of his proposed framework. In particular, it has had wide-reaching influence on how the biology of aging is interpreted. The readership of the journal here is broad, and our results have implications for that field too. The work of Kirkwood (many of the papers on this topic have over 2000 citations each) has been perhaps overly influential in many areas, so a link to how that work should be interpreted is highly relevant. If the reviewer is interested in this topic the following papers by one of the co-authors and others could be of interest, some of which we could not cite in the main manuscript due to space considerations:

      https://www.science.org/doi/pdf/10.1126/sciadv.aay3047

      https://agingcelljournal.org/Archive/Volume3/stochasticity_explains_non_genetic_inheritance_of_lifespan/

      https://pubmed.ncbi.nlm.nih.gov/21558242/

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.13444

      https://www.nature.com/articles/362305a0

      https://www.cell.com/trends/ecology-evolution/fulltext/S0169-5347(12)00147-4

      https://www.cell.com/cell/pdf/S0092-8674(15)01488-9.pdf

      https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-018-0562-z

      (16) L.42: this could be preceded with mentioning the limitations of observational data.

      We have added detail as to why brood manipulations are a good test for trade-offs and so this is now inherently implied.

      (17) L.42-43: why?

      We have added detail to this sentence.

      (18) L.45: do any of the references cited here really support this statement? I am certain that several do not - in these this statement is an assumption rather than something that is demonstrated. It may be useful to look at Kate Lessell's review on this that appeared in Etologia, I think in the 1990's. Mind however that 'reproductive effort' is operationally poorly defined for reproducing birds - provisioning rate is not necessarily a good measure of effort in so far as there are fitness costs.

      We have updated the references to support the sentence.

      (19) L.47: Given that you make this statement with respect to brood size manipulations in birds, it seems to me that the paper by Santos & Nakagawa is the only paper you should cite here. Given that you go on to analyze the same data it deserves to be discussed in more detail, for example to clarify what you aim to add to their analysis. What warrants repeating their analysis?

      Please first note that our dataset includes Santos & Nakagawa and additional studies, so it is not accurate to say we analyse the same data. Furthermore, we believe our study has implications beyond birds alone and so believe it is appropriate to cite the papers that do support our statement. We have added details to the methods to explicitly state what data is gathered from Santos & Nakagawa (it is only used to find the appropriate literature and data was re-extracted and re-analysed in a more appropriate way) and, separately, how we gathered the observational studies (see L352-381).

      (20) L.48: There are more possible explanations to this, which deserve to be discussed. For example, brood size manipulations may not have been that effective in manipulating reproductive effort - for example, effects on energy expenditure tend to be not terribly convincing. Secondly, the manipulations do not affect the effort incurred in laying eggs (which also biases your comparison with natural variation in clutch size). Thirdly, the studies by Boonekamp et al on Jackdaws found that while there was no effect of brood size manipulation on parental survival after one year of manipulation, there was a strong effect when the same individuals were manipulated in the same direction in multiple years. This could be taken to mean that costs are not immediate but delayed, explaining why single year manipulations generally show little effect on survival. It would also mean that most estimates of the fitness costs of manipulated brood size are not fit for purpose, because typically restricted to survival over a single year.

      Please see our response to this comment in the public reviews.

      Out of interest and because the reviewer mentioned “energy expenditure” specifically: There are studies that show convincing effects of brood size manipulation on parental energy expenditure. We do agree that there are also studies that show ceilings in expenditure. We therefore disagree that they “tend to be not terribly convincing”. Just a few examples:

      https://academic.oup.com/beheco/article/10/5/598/222025 (Figure 2)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/1365-2435.12321 (Figure 1)

      https://besjournals.onlinelibrary.wiley.com/doi/full/10.1046/j.1365-2656.2000.00395.x (but ceiling at enlarged brood).

      (21) L.48, "or, alternatively, that individuals may differ in quality": how do you see that happening when brood size is manipulated, and hence 'quality' of different experimental categories can be assumed to be approximately equal? This point does apply to observational studies, so I assume that that is what you had in mind, but that distinction should be clear (also on line 54).

      We have made it more clear that we determine if there are quality effects separate to the costs of reproduction found using brood manipulation studies.

      (22) L.50: Drent & Daan, in their seminal paper on "The prudent parent" (1980, Ardea) were among the earliest to make this point and deserve to be cited here.

      We have added this citation.

      (23) L.51, "relative importance": relative to what? Please be more specific.

      We have adjusted this sentence.

      (24) L.54: Vedder & Bouwhuis (2018, Oikos) go some way towards this point and should be explicitly mentioned with reference to the role of 'quality' effects on the association between reproductive output and survival.

      We have added this reference.

      (25) L.55: can you be more specific on what you want to do exactly? What you write here could be interpreted differently.

      We have added an explicit aim after this sentence to be more clear.

      (26) L.57: Here also a more specific wording would be useful. What does it mean exactly when you say you will distinguish between 'quality' and 'costs'?

      We have added detail to this sentence.

      (27) L.62: it should be clearer from the introduction that this is already well known, which will indirectly emphasize what you are adding to what we know already.

      We would argue this is not well known and has only been theorised but not shown empirically, as we do here.

      (28) L.62: you equate clutch size with 'quality' here - that needs to be spelled out.

      We refer to quality as the positive effect size of survival for a given clutch size, not clutch size alone. We appreciate this is not clear in this sentence and have reworded.

      (29) L.64: this looks like a serious misunderstanding to me, but in any case, these inferences should perhaps be left to the discussion (this also applies to later parts of this paragraph), when you have hopefully convinced readers of the claims you make on lines 62-63.

      We are unsure of what the reviewer is referring to as a misunderstanding. We have chosen this format for the introduction to highlight our results. If this is a problem for the editors we will change as required.

      (30) L.66: quantitative comparison of what?

      Comparison of species. We have changed the wording of this sentence.

      (31) L.67-69: this should be in the methods.

      We have used a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (32) L.74-88: suggest to (re)move this entire paragraph, presenting inferences in such an uncritical manner before presenting the evidence is inappropriate in my view. I have therefore refrained from commenting on this paragraph.

      We have chosen a modern format which highlights our result. We are happy to change the format should the editors wish us to.

      (33) L.271, "must detail variation in the number of raised young": it is not sufficiently clear what this means - what does 'detail' mean in this context? And what does 'number of raised young' mean? The number hatched or raised to fledging?

      We have now made this clear.

      (34) L.271, "must detail variation in the number of raised young": looking at table S4, it seems that, on the basis of this criterion, brood size manipulation studies where details on the number of young manipulated were missing are also excluded. I see little justification for this - surely these manipulations can, for example, be coded as having the average manipulation size in the meta-analysis data set, thereby contributing to tests of manipulation effects, but not to variation within the manipulation groups?

      We have done in part what the reviewer describes. We are specifically interested in the manipulation size, so we required this to compare effect sizes across species and categories, a key advance of our study and outlined in many places in our manuscript. Note, however, that we only need comparative differences, and have used clutch size metrics more generally to obtain a mean clutch size for a species, as well as SD where required. Please also note that our supplement details exactly why studies were excluded from our analysis, as is the preferred practice in a meta-analysis.

      (35) L.271, "referred to as clutch size": the point of this simplification is not clear to me, and it is clearly confusing - why not refer to 'brood size' instead?

      Brood size and clutch size can be used interchangeably here because, in the observational studies, the individuals vary in the number of eggs produced, whereas for brood manipulations this obviously happens after hatching and brood is perhaps a more appropriate term, but we wanted to simplify the terminology used. However, we use clutch size throughout as the aim of our study is to determine why individuals differ in the number of offspring they produce, and so clutch size is the most appropriate term for that.

      (36) L.280: according to the specified inclusion criteria (lines 271/272) these studies should already be in the data set, so what does this mean exactly?

      Selection criteria refers to whether a given study should be kept for analysis or not. It does not refer to how studies were found. Please see lines 361-378 for details on how we found studies (additional details are also in the Supplementary Methods).

      (37) L.281: the use of 'quality' here is misleading - natural variation in clutch or brood size will have multiple causes, variation in phenotypic quality of the individuals and their environment (territories) is only one of the causes. Why not simply refer to what you are actually investigating: natural and experimental variation in brood size.

      We disagree: our study aims to separate quality effects from the costs of reproduction and we use observational studies to test for quality differences, though we make no inference about the mechanisms. We do not imply that the environment causes differences in quality, but that to directly compare observational and experimental groups, they should contain similar species. So, to be clear again, quality refers to the positive covariation of clutch size with survival. We feel that we explain this clearly in our study’s rationale and have also improved our writing in several sections on this to avoid any confusion (see responses to earlier comments by the three reviewers).

      (38) L.283, "in most cases": please be exact and say in xx out xx cases.

      We have added the number of studies for each category here.

      (39) L.283-285: presumably readers can see this directly in a table with the extracted data?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We believe the data are too large to include as a table in the main text and are not essential in understanding the paper. We do believe, however, that all readers should have access to this information if they wish, and so it is publicly available.

      (40) L.293: there does not seem to be a table that lists the included studies and effect sizes. It is not uncommon to find major errors in such tables when one is familiar with the literature, and absence of this information impedes a complete assessment of the manuscript.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data are too large to include as a table in the main text and are not essential in understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      (41) L.293: from how many species?

      We have added this detail.

      (42) L.296, "longevity": this is a tricky concept, not usually reported in the studies you used, so please describe in detail what data you used.

      We have removed longevity as we did not use this data in our current version of the manuscript.

      (43) L. 298: again: where can I see this information?

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process but we believe this was not passed onto the reviewers.

      (44) L. 304, "we used raw data": I assume that for the majority of papers the raw data were not available, so please explain how you dealt with this. Or perhaps this applies to a selection of the studies only? Perhaps the experimental studies?

      By raw data, we mean the absolute value of offspring in the nest. We have changed the wording of this sentence and added detail about whether the absolute value of offspring was not present for brood manipulation studies (L393-397).

      (45) L.304: When I remember correctly, Santos and Nakagawa examined effects of reducing and enlarging brood size separately, which is of importance because trade-off curves are unlikely to be linear and whether they are or not has major effects on the optimization process. But perhaps you tackled this in another way? I will read on.....

      You are correct that Santos & Nakagawa compared brood increases and reductions to control separately. Note that this only partially accounts for non-linearity and it does not take into account the severity of the change in brood size. By using a logistic regression of absolute clutch size, as we have done, we are able to directly compare brood manipulations with observational studies. Please see Supplementary Methods lines 11-12, where we have added additional detail as to why our approach is beneficial in this analysis.

      (46) L.319: what are you referring to exactly with "for each clutch size transformation"?

      We refer to the raw, standardised and proportional clutch size transformations. We have added detail here to be more clear.
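      The three clutch size scales named in this response can be illustrated with a minimal sketch. The exact formulas are assumptions made for illustration: "standardised" is taken here to mean the deviation from the species mean in units of the species SD, and "proportional" to mean the clutch size divided by the species mean. The function name and the example values are hypothetical and are not taken from the manuscript or its dataset.

```python
def transform_clutch_size(clutch, species_mean, species_sd):
    """Express one clutch size on three scales.

    Assumed definitions (illustrative only, not confirmed by the manuscript):
      raw          = number of offspring in the nest
      standardised = (clutch - species mean) / species SD
      proportional = clutch / species mean
    """
    return {
        "raw": clutch,
        "standardised": (clutch - species_mean) / species_sd,
        "proportional": clutch / species_mean,
    }


# Adding one chick to a hypothetical small-clutch species (mean 2, SD 0.5)
small = transform_clutch_size(3, species_mean=2.0, species_sd=0.5)

# Adding one chick to a hypothetical large-clutch species (mean 10, SD 2)
large = transform_clutch_size(11, species_mean=10.0, species_sd=2.0)

# The same raw change of +1 is a larger proportional change when the mean is small.
print(small["proportional"], large["proportional"])
```

      Under these assumed definitions, the same one-chick enlargement is a 50% increase for the small-clutch species but only a 10% increase for the large-clutch species, which is the sense in which a proportional scale captures the severity of a manipulation.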

      (47) L.319: is there a cost of survival? Perhaps you mean 'survival cost'? This would be appropriate for the experimental data, but not for the observational data, where the survival variation may be causally unrelated to the brood size variation, even if there is a correlation.

      We have changed “cost of survival” to “effect of parental survival”. We only intend to imply causality for the experimental studies. For observational studies we do not suggest that increasing clutch size is causal for increasing survival, only correlative (and hence we use the phrase “quality”).

      (48) L.320: please replace "parental effort" with something like 'experimental change in brood size'.

      We have changed “parental effort” to “reproductive effort”.

      (49) L.321: due to failure of one or more eggs to hatch, and mortality very early in life, before brood sizes are manipulated, it is not likely that say an enlargement of brood size by 1 chick can be equated to the mean clutch size +1 egg / chick. For example, in the Wytham great tit study, as re-analysed by Richard Pettifor, a 'brood size manipulation' of unmanipulated birds is approximately -1, being the number of eggs / chicks lost between laying and the time of brood size manipulation. Would this affect your comparisons?

      We agree these are important factors in determining what a clutch/brood size actually is for a given individual/pair, as this can vary from egg laying to fledging. However, we do not believe that accounting for this (if it were possible to do so) would significantly affect our conclusions, as observational studies are comparable in that these birds would also likely see early life mortality of their offspring. It is also possible that parents already factor in this loss, and so a brood manipulation still changes the parental care effort an individual has to incur.

      (50) L.332: instead of "adjusted" perhaps say 'mean centred'?

      We have implemented this suggestion.

      (51) L.345: this statement surprised me, but is difficult to verify because I could not locate a list of the included studies. However, to my best knowledge, most studies reporting brood size manipulation effects on parental survival had this as their main focus, in contrast to your statement.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process but we believe this was not passed onto the reviewers by the journal, although supplied by us on several occasions. We regret that the reviewer was impeded by this unfortunate communication failure, but we did our best to make the data available to the reviewers during the initial review process.

      (52) L.361-362: this seems a realistic approach from an evolutionary perspective, but we know from the jackdaw study by Boonekamp that the survival effect of brood size manipulation in a single year is very different from the survival effect of manipulating as in your model, i.e. every year of an individual's life the same manipulation. For very short-lived species this possibly does not make much difference, but for somewhat longer-lived species this could perhaps strongly affect your results. This should be discussed, and perhaps also explored in your simulations?

      Note that the Boonekamp study does not separate whether the survival effects are additive or multiplicative. As such, we do not know whether the survival effects for a single year manipulation are just small and hard to detect, or whether the survival effects are multiplicative. Our simulations assumed that the brood enlargement occurred every year throughout their lives. We have added some text to the discussion on the point you raise.

      (53) L.360: what is "lifetime reproductive fitness"? Is this different from just "fitness"?

      We have changed “lifetime reproductive fitness” to “lifetime reproductive output”.

      (54) L.363: when you are interested in optimal clutch size, why not also explore effects of reducing clutch size?

      As we find that a reduction in clutch size leads to a reduction in survival (for experimental studies), we already know that these individuals would have a reduced fitness return compared to reproducing at their normal level, and so we would not learn anything from adding this into our simulations. The interest in using clutch size enlargements is to find out why an individual does not produce more offspring than it does, and the answer is that it would not have a fitness benefit (unless its clutch size and survival rate combination is out of the bounds of that observable in the wild).

      (55) Fig.1 - using 'parental effort' in the y-axis label is misleading, suggest to replace with e.g. "clutch or brood size". Using "clutch size" in the title is another issue, as the experimental studies typically changed the number of young rather than the number of eggs.

      We have updated the figure axes to say “clutch size” rather than “parental effort”. Please see response to comment 35 where we explain our use of the term “clutch size” throughout this manuscript.

      (56) L.93 - 108: I appreciate the analysis in Table 1, in particular the fact that you present different ways of expressing the manipulation. However, in addition, I would like to see the results of an analysis treating the manipulations as factor, i.e. without considering the scale of the manipulation. This serves two purposes. Firstly, I believe it is in the interest of the field that you include a detailed comparison with the results of Santos & Nakagawa's analysis of what I expect to be largely the same data (manipulation studies only - for this purpose I would also like to see a comparison of effect size between the sexes). Secondly, there are (at least) two levels of meta-analysis, namely quantifying an overall effect size, and testing variables that potentially explain variation in effect size. You are here sort of combining the two levels of analysis, but including the first level also would give much more insight in the data set.

      Our main intention here was to improve on how the same hypothesis was approached by Santos & Nakagawa. We did this by improving our analysis (on a by “egg” basis) and by adding additional studies (i.e. more data). In this process mistakes are corrected (as we re-extracted all data, and did not copy anything across from their dataset – which was used simply to ensure we found the same papers); more recent data were also added, including studies missed by Santos & Nakagawa. This means that the comparison with Santos & Nakagawa becomes somewhat irrelevant, apart from maybe technical reasons, i.e. pointing out mistakes or limitations in certain approaches. We would not be able to pinpoint these problems clearly without considering the whole dataset, yet Santos & Nakagawa only had a small subset of the data that were available to us. In short, meta-analysis is an iterative process and similar questions are inevitably analysed multiple times and updated. This follows basic meta-analytic concepts and Cochrane principles. Except where there is a huge flaw in a prior dataset or approach (like we sometimes found and highlighted in our own work, e.g. Simons, Koch, Verhulst 2013, Aging Cell), in itself a comparison of the kind the reviewer suggests distracts from the biology. With the dataset being made available others can make these comparisons, if required. On the sex difference, we provide a comparison of effect sizes separated between both sexes and mixed sex in Table S2 and Figure S1.

      (57) L.93 - 108: a thing that does not become clear from this section is whether experimentally reducing brood size affects parental survival similarly (in absolute terms) as enlarging brood size. Whether these effects are symmetric is biologically important, for example because of its effect on clutch size optimization. In the text you are specific about the effects of increasing brood size, but the effect you find could in theory be due entirely to brood size reduction.

      We have added detail to make it clear that a brood reduction is simply the opposite trend. We use linear relationships because they serve as a good approximation of the trend and provide a more rigorous test for an underlying relationship than fitting nonlinear models would. For many datasets there is not a range of chicks added for which a non-linear relationship could be estimated. The question also remains of what the shape of this non-linear relationship should be, which is hard to determine a priori.

      We have added some discussion on this to our manuscript (L278-282), in response to an earlier comment.

      (58) L.103-107: this is perhaps better deferred to the discussion, because other potential explanations should also be considered. For example, there have been studies suggesting that small birds were provisioning their brood full time already, and hence had no scope to increase provisioning effort when brood size was experimentally increased.

      We agree this is a discussion point but we believe it also provides an important context for why we ran our simulations, and so we believe this is best kept brief but in place. We agree the example you give is relevant but believe this argument is already contained in this section. See lines 121-123 “...suggesting that costs to survival were only observed when a species was pushed beyond its natural limits”.

      (59) L.103-107: this discussion sort of assumes that the results in Table 1 differ between the different ways that the clutch/brood size variation is expressed. Is there any statistical support for this assumption?

      We are unsure of what the reviewer means here exactly. Note that in each of the clutch size transformations, experimental and observational effect sizes are significantly opposite. For the proportional clutch size transformation, experimental and observational studies are both separately significantly different from 0.

      (60) L.104: at this point, I would like to have better insight into the data set. Specifically, a scatter plot showing the manipulation magnitude (raw) plotted against control brood size would be useful.

      Our data and code can be accessed with the following link: https://doi.org/10.5061/dryad.q83bk3jnk. We did supply this information when we submitted our manuscript and again during the review process but we believe this was not passed onto the reviewers by the journal.

      Thank you for this suggestion, which is also useful for illustrating how manipulations are relatively stronger for species with smaller clutches, in line with our interpretation of the result presented in Figure 2. We have added Figure S1 which shows the strength of manipulation compared to the species average.

      (61) L. 107: this seems a bold statement - surely you can test directly whether effect size becomes disproportionally stronger when manipulations are outside the natural range, for example by including this characterization as a factor in the models in Table 1.

      It is hard to define exactly what the natural range is here, so it is not easy to factorise objectively, which is why we chose not to do this. However, it is clear that for species with small clutches the manipulation itself is often outside the natural range. Thank you for your suggestion to include a figure for this as it is clear manipulations are stronger in species with smaller clutches. We attribute this to species being forced outside their natural range. We consider our wording makes it clear that this is our interpretation of our findings and we therefore do not think this is a bold statement, especially as it fits with how we interpret our later simulations.

      (62) Fig.3, legend: the term 'node support' does not mean much to me, please explain.

      Node support is a value given in phylogenetic trees to indicate the confidence in a branch. In this case, values are given as a percentage and so can translate to how many times out of 100 the estimate of the phylogeny gives the same branching. Our values are low, as we have relatively few species in our meta-analysis.
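      The idea of node support as "how many times out of 100 the same branching is recovered" can be sketched as a simple count over replicate trees. This is an illustration only: the representation of a tree as a set of clades, the function name and the tip labels A to D are all hypothetical, and the sketch makes no claim about how the phylogeny in Figure 3 was actually estimated.

```python
def node_support(clade, replicate_trees):
    """Percentage of replicate trees in which `clade` is recovered.

    Each tree is represented (as an assumption of this sketch) by a set of
    clades, and each clade by a frozenset of tip labels.
    """
    clade = frozenset(clade)
    hits = sum(1 for tree in replicate_trees if clade in tree)
    return 100.0 * hits / len(replicate_trees)


# Four hypothetical replicate trees for tips A, B, C and D
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ACD")},
    {frozenset("AB"), frozenset("ABC")},
]

support_ab = node_support("AB", trees)    # grouping of A with B
support_abc = node_support("ABC", trees)  # grouping of A, B and C
print(support_ab, support_abc)
```

      A grouping that appears in most replicates receives high support, whereas a grouping that appears in only some of them receives low support.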

      (63) Fig.3: it would be informative when you indicate in this figure whether the species contributed to the experimental or the observational data set or both.

      We have added into Fig 3 whether the species was observational, experimental or both.

      (64) L.139: the p-value refers to the interaction between species clutch size and treatment (observational vs. experimental), but it appears that no evidence is presented for the correlation being significant in either observational or experimental studies.

      We agree that our reporting of the effect size could be misinterpreted and have added detail here. The statistic provided indicates that the slopes are significantly different between observational and experimental studies, implying there are differences between the slopes of small and large clutch-laying species.

      (65) L.140: I am wondering to what extent these correlations, which are potentially interesting, are driven by the fact that species average clutch size was also used when expressing the manipulation effect. In other words, to what extent is the estimate on the Y-axis independent from the clutch size on the X-axis? Showing that the result is the same when using survival effect sizes per manipulation category would considerably improve confidence in this finding.

      We are unsure what the reviewer means by “per manipulation category”. Please also note that we have used a logistic regression to calculate our effect sizes of survival, given a unit increase in reproductive effort. So, for example, if a population contained birds that lay 2, 3 or 4 eggs, provided that the number of birds which survived and died in each category did not change, if we changed the number of eggs raised to 10, 11 or 12, respectively, then our effect size would be the same. In this way, our effect sizes are independent of the species’ average clutch size.
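      The invariance described in this response can be checked with a minimal sketch: a logistic regression slope is unchanged when a constant is added to the predictor, because only the intercept absorbs the shift. The fitting routine below is a plain Newton-Raphson implementation written for this illustration, not the authors' code, and the survival counts are invented.

```python
import math


def fit_logistic(groups, iterations=50):
    """Fit logit(p) = a + b * x by Newton-Raphson on grouped data.

    groups: list of (x, n_survived, n_died) tuples.
    Returns the intercept a and the slope b (log odds per unit of x).
    """
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, alive, dead in groups:
            n = alive + dead
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = alive - n * p      # observed minus expected survivors
            w = n * p * (1.0 - p)      # binomial variance weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = gradient
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Invented counts of (clutch size, survived, died)
small_clutches = [(2, 60, 40), (3, 55, 45), (4, 50, 50)]

# Same survival counts, but clutch sizes shifted to 10, 11 and 12
large_clutches = [(x + 8, alive, dead) for x, alive, dead in small_clutches]

_, slope_small = fit_logistic(small_clutches)
_, slope_large = fit_logistic(large_clutches)
print(slope_small, slope_large)
```

      The two slopes agree up to floating-point error while the intercepts differ. Note that this holds for a shift of the raw clutch size scale; rescaling the predictor, for example by dividing by the species mean, would change the slope.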

      (66) L.145: when I remember correctly, Santos & Nakagawa considered brood size reduction and enlargement separately. Can this explain the contrasting result? Please discuss.

      You are correct, in that Santos & Nakagawa compared reductions and enlargements to controls separately. However, we found some mistakes in the data extracted by Santos & Nakagawa that we believe explain the differences in our results for sex-specific effect sizes. We do not feel that highlighting these mistakes in the main text is fair, useful or scientifically relevant, as our approach is to improve the test of the hypothesis.

      (67) L.158-159: looking at table S2 it seems to me you have a whole range of estimates. In any case, there is something to be said for taking the estimates for females because it is my impression (and experience) that clutch size variation in most species is a sex-linked trait, in that clutch size tends to be repeatable among females but not among males.

      We agree that, in many cases, the female is the one that ultimately decides on the number of chicks produced. We did also consider using female effect sizes only, however, we decided against this for the following reasons: (1) many of the species used in our meta-analysis exhibit biparental care, as is the case for many seabirds, and so using females only would bias our results towards species with lower male investment; in our case this would bias the results towards passerine species. (2) it has also been shown that, as females in some species are operating at their maximum of parental care investment, it is the males who are able to adjust their workload to care for extra offspring. (3) we are ultimately looking at how many offspring the breeding adults should produce, given the effort it costs to raise them, and so even if the female chooses a clutch size completely independently of the male, it is still the effort of both parents combined that determines whether the parents gain an overall fitness benefit from laying extra eggs. (4) some studies did not clearly specify male or female parental survival and we would not want to reduce our dataset further.

      (68) L.158-168: please explain how you incorporated brood size effects on the fitness prospects of offspring, given that it is a very robust finding of brood size manipulation studies that this affects offspring growth and survival.

      We would argue this is near-on impossible to incorporate into our simulations. It is unrealistic to suggest that incorporating offspring growth into our simulations would add insight, as a change in offspring number rarely affects all offspring in the nest equally and there can even be quite stark differences; for example, this will be most evident in species that produce sacrificial offspring. This effect will be further confounded by catch-up growth, for example, and so it is likely that increased sibling competition from added chicks alters offspring growth trajectories, rather than absolute growth as the reviewer suggests. There are mixed results in the literature on the effect of altering clutch size on offspring survival, with an increased clutch size through manipulation often increasing the number of recruits from a nest. It would be interesting, however, to explore this further using estimates from the literature, but this is beyond our current scope, and would in our initial intuition not be very accurate. It would be interesting to explore how big the effect on offspring should be to constrain effect size strongly. Such work would be more theoretical. The point of our simple fitness projections here is to aid interpretation of the quantitative effect size we estimated.

      (69) L.163: while I can understand that you select the estimate of -0.05 for computational reasons, it has enormous confidence intervals that also include zero. This seems problematic to me. However, in the simulations, you also examined the results of selecting -0.15, which is close to the lower end of the 95% C.I., which seems worth mentioning here already.

      Thank you for this suggestion. Yes, indeed, our range was chosen based on the CI, and we have now made this explicit in the manuscript.
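      To make the role of the effect sizes -0.05 and -0.15 concrete, the following is a deliberately simplified sketch of a fitness projection. It is not the authors' simulation. It assumes that the effect size is a change in the log odds of annual parental survival per extra egg, that the enlargement is applied in every year of life, that annual survival is otherwise constant, and that offspring fitness is unaffected. All function names and the example clutch size and survival rate are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def lifetime_output(clutch, annual_survival):
    """Expected lifetime number of offspring under constant annual survival.

    Expected number of breeding seasons is 1 + s + s^2 + ... = 1 / (1 - s).
    """
    return clutch / (1.0 - annual_survival)


def relative_output(clutch, baseline_survival, effect_per_egg, extra_eggs=1):
    """Lifetime output with an enlarged clutch relative to the normal clutch.

    effect_per_egg is assumed to act on the log odds of annual survival.
    """
    shifted = inv_logit(logit(baseline_survival) + effect_per_egg * extra_eggs)
    enlarged = lifetime_output(clutch + extra_eggs, shifted)
    normal = lifetime_output(clutch, baseline_survival)
    return enlarged / normal


# Hypothetical species: clutch of 4 eggs, annual survival of 0.5
for effect in (0.0, -0.05, -0.15):
    print(effect, relative_output(4, 0.5, effect))
```

      With no survival cost, one extra egg simply scales output by 5/4 in this example. A more negative effect size reduces that ratio, so comparing the values for -0.05 and -0.15 shows how sensitive the projected benefit is to the assumed survival cost.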

      (70) L.210: defined in this way, in my world this is not what is generally taken to be a selection differential. Is what you show not simply scaled lifetime reproductive success?

      As far as we are aware, a selection differential is the relative change between a given group and the population mean, which is what we have done here. We appreciate this is a slightly unusual context in which to place this, but it is more logical to consider the individuals who produce more offspring as carrying a potential mutation for higher productivity. However, we believe that “selection differential” is the best terminology for the statistic we present. We also detail in our methodology how we calculate this. We have adjusted this sentence to be more explicit about what we mean by selection differential.

      (71) L.177-180: is this not so because these parameter values are closest to the data you based your estimates on, which yielded a low estimate and hence you see that here also?

      We are unsure of what exactly the reviewer means here. The effect sizes for our exemplar species were predicted from each combination of clutch size and survival rate. Note that we used a range of effect sizes, higher than that estimated in our meta-analysis, to explore a large parameter space and that these same conclusions still hold.

      (72) L.191-194: these statements are problematic, because based on the assumption that an increase in brood size does not impact the fitness prospects of the offspring, and we know this assumption to be false.

      Though we appreciate that some cost is often absorbed by the offspring themselves, we are unaware of any evidence that these costs are substantial and large enough to drive within-species variation in reproductive effort, though for some specific species this may be the case. However, in terms of explaining a generalisable, across-species trend, the fitness costs incurred by a reduction in offspring quality are unlikely to be significantly larger than the survival costs to reproduce. We also find it highly unlikely the cost to fitness incurred by a reduction in offspring quality is large enough to counter-balance the effect of parental quality that we find in our observational studies. We do also discuss other costs in our discussion.

      (73) L.205: here and in other places it would be useful to be more explicit on whether in your discussion you are referring to observational or experimental variation.

      We have added this detail to our manuscript. Do note that many of our conclusions are drawn by the combination of results of experimental and observational studies. We believe the addition of Figure 5 makes this more clear to the reader.

      (74) L.225: this may be true (at least, when we overlook the misuse of the word 'quality' here), but I would expect some nuance here to reflect that there is no surprise at all in this result as this pattern is generally recognized in the literature and has been the (empirical) basis for the often-repeated explanation of why experiments are required to demonstrate trade-offs. On a more quantitative level, it is worth mentioning the paper of Vedder & Bouwhuis (2017, Oikos) that essentially shows the same thing, i.e. a positive association between reproductive output and parental survival.

      We have added some discussion on this point, including adding the citation mentioned. However, we would like to highlight that our results demonstrate that brood manipulations are not necessarily a good test of trade-offs, as they fail to recognise that individuals differ in their underlying quality. Though we agree that this result should not necessarily be a surprising one, we have also not found it to be the case that differences in individual quality are accepted as the reason that intra-specific clutch size is maintained – in fact, we find that it is most commonly argued that when costs of reproduction are not identified, it is concluded that the costs must be elsewhere – yet we cannot find conclusive evidence that the costs of reproduction (wherever they lie) are driving intra-specific variation in reproductive effort. Furthermore, some studies in our dataset have reported negative correlations between reproductive effort and survival (see observational studies, Figure 1).

      (75) L.225-226: perhaps present this definition when you first use the term.

      We have added more detail to where we first use and define this term to improve clarity (L57-58).

      (76) L.227-228, "currently unknown": this statement surprised me, given that there is a plethora of studies showing within-population variation in clutch size to depend on environmental conditions, in particular the rate at which food can be gathered.

      We mean to question that if an individual is “high quality”, why is it not selected for? We have rephrased, to improve clarity.

      (77) L.231: this seems no more than a special case of the environmental effect you mention above.

      We think this is a relevant special case, as it constitutes within-individual variation in reproduction that is mistaken for between-individual variation. This is a common problem in our field that we feel needs addressing. We only have between-individual variation here in our study on quality, and by highlighting this we show that there might not be any variation between individuals, but this could come about fully (doubtful) or partly (perhaps likely) due to terminal effects.

      (78) L235-236: but apparently depending on how experimental and natural variation was expressed? Please specify here.

      We are not sure what results the reviewer is referring to here, as we found the same effect (smaller clutch laying species are more severely affected by a change in clutch size) for both clutch size expressed as raw clutch size and standardised clutch size.

      (79) L.237: the concept of 'limits' is not very productive here, and it conflicts with the optimality approach you apply elsewhere. What you are saying here can also be interpreted as there being a non-linear relationship between brood size manipulation and parental survival, but you do not actually test for that. A way to do this would be to treat brood size reduction and enlargement separately. Trade-off curves are not generally expected to be linear, so this would also make more sense biologically than your current approach.

      We have replaced “limits” with “optima”. We believe our current approach of treating clutch size as a continuous variable, regardless of manipulation direction, is the best approach, as it allows us to directly compare with observational studies and between species that use different manipulations (now nicely illustrated by the reviewer’s suggested Figure S1). Also note that transforming clutch size to a proportion of the mean allows us to account for the severity in change in clutch size. We also do not believe that treating reductions and enlargements separately accounts for non-linearity, as either we are separating this into two linear relationships (one for enlargements and one for reductions) or we compare all enlargements/reductions to the control, as in Santos & Nakagawa 2012, which does not take into account the severity of the increase, which we would argue is worse for accounting for non-linearity. Furthermore, in the cases where the manipulation involved one offspring only, we also cannot account for non-linearity.

      (80) L.239: assuming birds are on average able to optimize their clutch size, one could argue that any manipulation, large or small, on average forces birds to raise a number of offspring that deviates from their natural optimum. At this point, it would be interesting to discuss in some detail studies with manipulation designs that included different levels of brood size reduction/enlargement.

      We agree with the reviewer that any manipulation moves an individual’s clutch size away from its own optimum. As we have argued, this also means brood manipulations are not necessarily a good test of whether a trade-off occurs in the wild (naturally), as there could be interactions with quality. We have now edited the text to state this explicitly (L299-300).

      (81) L.242-244: when you choose to maintain this statement, please add something along the lines of "assuming there is no trade-off between number and quality of offspring".

      As explained above, though we agree that the offspring may incur some of the cost themselves, we are not aware of any evidence suggesting this trade-off is also large enough to drive intra-specific variation in clutch size across species. Furthermore, in the context here, the trade-off between number and quality of offspring would not change our conclusion – that the fitness benefit of raising more offspring is offset by the cost on survival. We have added detail on the costs incurred by offspring earlier in our discussion (L309-315). The addition of Figure 5 should help interpret these data.

      (82) L.253: instead of reference 30 the paper by Tinbergen et al in Behaviour (1990) seems more appropriate.

      We believe our current citation is relevant here but we have also added the Tinbergen et al (1990) citation.

      (83) L.253-254: such trade-offs may perfectly explain variation in reproductive effort within species if we were able to estimate cost-benefit relations for individuals. In fact, reference 29 goes some way to achieve this, by explaining seasonal variation in reproductive effort.

      We are unaware of any quantitative evidence that any combination of trade-offs explains intra-specific variation in reproductive effort, especially as a general across-species trend.

      (84) L.255: how does one demonstrate "between species life-history trade-offs"? The 'trade-off' between reproductive rate and survival we observe between species is not necessarily causal, and hence may not really be a trade-off but due to other factors - demonstrating causality requires some form of experimental manipulation.

      Between-species trade-offs are well established in the field, stemming from GC Williams’ seminal paper in 1966 and, for example, r/K selection theory. It is possible to move from these correlations to testing for causation, and this is currently being done by introducing transgenes (genes from other species) that promote longevity into shorter-lived species (e.g., naked mole-rat genes into mice). As yet, it is unclear what the effects on reproduction are.

      (85) L.256: it is quite a big claim that this is a novel suggestion. In fact, it is a general finding in evolutionary theory that fitness landscapes tend to be rather flat at equilibrium.

      What is novel here is that we simulate the effect size we found and show that, because the resulting fitness landscape is relatively flat, no directional selection is observed. We did not intend to suggest that our interpretation of flat fitness landscapes is itself novel. We have changed the phrasing of this sentence to avoid misinterpretation.

      (86) L.259: why bring up physiological 'costs' here, given that you focus on fitness costs? Do you perhaps mean fitness costs instead of physiological costs? Furthermore, here and in the remainder of this paragraph it would be useful to be more specific on whether you are considering natural or experimental variation.

      The survival cost is a physiological cost incurred through reduced self-maintenance as a result of lower resource allocation. This is one arm of fitness; we feel it would be confusing here to talk about costs to fitness, as we do not assess costs to future reproduction (which formed a large part of the critique offered by the reviewer). We would like to highlight that the aim of this manuscript was to separate costs of reproduction from the effects of quality, and this is why we have observational and experimental studies in one analysis, rather than separately. Our conclusion that we have found no evidence that the survival cost of reproduction drives within-species variation in clutch size comes both from the positive correlation found in the observational studies and from the negligible fitness return estimates in our simulations. We therefore do not believe it is helpful to separate observational and experimental conclusions throughout our manuscript, as the point is that they are inherently linked. We hope that the addition of Figure 5 makes this clearer.

      (87) L.262: The finding that naturally more productive individuals tend to also survive better one could say is by definition explained by variation in 'quality', how else would you define quality?

      We agree, and hence we believe quality is a good term to describe individuals who perform highly in two different traits. Note that we also say the lack of evidence that trade-offs drive intra-specific variation in clutch size also potentially suggests an alternative theory, including intra-specific variation driven by differences in individual quality.

      Supplementary information

      (88) Table S1: please provide details on how the treatment was coded - this information is needed to derive the estimates of the clutch size effect for the treatments separately.

      We have added this detail.

      (89) Table S2: please report the number of effect sizes included in each of these models.

      We have added this detail.

      (90) Table S4: references are not given. Mentioning species here would be useful. For example, Ashcroft (1979) studied puffins, which lay a single egg, making me wonder what is meant when mentioning "No clutch or brood size given" as the reason for exclusion. A few more words to explain why specific studies were excluded would be useful. For example, what does "Clutch size groups too large" mean? It surprises me that studies are excluded because "No standard deviation reported for survival" - as the exact distribution is known when sample size and proportion of survivors is known.

      We have updated this table for more clarity.

      (91) Fig.S1: please plot different panels with the same scale (separately for observational and experimental studies). You could add the individual data points to these plots - or at least indicate the sample size for the different categories (female, male, mixed).

      We have scaled all panels to have the same y axis and added sample sizes to the figure legend.

      (92) Fig.S3: please provide separate plots for experimental and observational studies, as it seems entirely plausible that the risk of publication bias is larger for observational studies - in particular those that did not also include a brood size manipulation. At the same time, one can wonder what a potential publication bias among observational studies would represent, given that apparently you did not attempt to collect all studies that reported the relevant information.

      We have coloured the points for experimental and observational studies. Note that a study is an independent effect size and, therefore, does not indicate whether multiple data (i.e., both experimental and observational studies) came from the same paper. As we detail in the paper and above in our reviewer responses, we searched for observational studies from species used in the experimental studies to allow direct comparison between observational and experimental datasets.

      Reviewer #2 (Recommendations For The Authors):

      I strongly recommend improving the theoretical component of the analysis by providing a solid theoretical framework before, from it, drawing conclusions.

      This, at a minimum, requires a statistical model and most importantly a mechanistic model describing the assumed relationships.

      We thank the reviewer for highlighting that our aims and methodology are unclear in places. We have added detail to our model and simulation descriptions and have improved the description of our rationale. We also feel the failure of the journal to provide code and data to the reviewers has not helped their appreciation of our methodology and use of data.

      Because the field uses the same wording for different concepts and different wording for the same concept, a glossary is also necessary.

      We thank the reviewer for raising this issue. During the revision of this manuscript, we have simplified our terminology or given a definition, and we believe this is sufficient for readers to understand our terminology.

      Reviewer #3 (Recommendations For The Authors):

      • The files containing information of data extracted from each study were not available so it has not been possible to check how any of the points raised above apply to the species included in the study. The ms should include this file on the Supp. Info as is standard good practice for a comparative analysis.

      We supplied a link to our full dataset and the code we used in Dryad with our submitted manuscript. We were also asked to supply our data during the review process and we again supplied a link to our dataset and code, along with a folder containing the data and code itself. We received confirmation that the reviewers had been given our data and code. We support open science and it was our intention that our dataset should be fully available to reviewers and readers. We believe the data is too large to include as a table in the main text and is not essential in understanding the paper. Our data and code are at https://doi.org/10.5061/dryad.q83bk3jnk.

      • For clarity, refer to "the effect size of clutch size on survival" rather than simply "effect size". Figures 1 and 2 require cross-referencing with the main text to understand the y-axis.

      We have added detail to the figure legend to increase the interpretability of the figures.

      • Silhouettes in Figure 3 (or photos) would help readers without ornithological expertise to understand the taxonomic range of the species included in the analyses.

      We have added silhouettes into Figure 3.

      • Throughout the discussion: superscripts shouldn't be treated as words in a sentence so please add authors' names where appropriate.

      We have added author names and dates where required.

    1. https://web.archive.org/web/20240528070547/https://shkspr.mobi/blog/2023/05/the-limits-of-general-purpose-computation/

      Terence Eden (posted #2023/05/28) on whether an app provider gets a say in running their code on your device, as opposed to me being in control of the device and deciding which code runs there. The case here is a bank that disallows its app on a rooted phone because of the risk profile attached to that. Interesting tension: my risk assessment and control over a general-purpose computing device versus a service provider for whom the software is a conduit, with its own risk assessment. I suspect the underlying issue is that such tensions need a conversation or negotiation to resolve, but in practice the outcome is dictated by one party based on a power differential (the bank controls your money, so it can set demands for your device, because you need to keep access to your account).

    1. User Interface: Also known as the presentation layer, it is responsible for all user interaction, handling the display of data, and processing inputs and interface events such as button clicks and text highlighting. Usually, this layer is implemented as a desktop application. For example, an academic system should provide a graphical interface for instructors to enter grades for their classes. The main element of this interface can be a form with two columns: student name and grade. The code implementing this form resides in the interface layer.

      User-facing functionality: supports interaction, display, and so on.

    1. Summary of "Revised Report on the Propagator Model" by Alexey Radul and Gerald Jay Sussman

      Introduction

      • Main Problem: Traditional programming models hinder extending existing programs for new situations due to rigid commitments in the code.
      • Quote: "The most important problem facing a programmer is the revision of an existing program to extend it for some new situation."
      • Solution: The Propagator Programming Model supports multiple viewpoints and integration of redundant solutions to aid program extensibility.
      • Quote: "The Propagator Programming Model is an attempt to mitigate this problem."

      Propagator Programming Model

      • Core Concept: Autonomous machines (propagators) communicate via shared cells, continuously adding information based on computations.
      • Quote: "The basic computational elements are autonomous machines interconnected by shared cells through which they communicate."
      • Additivity: New contributions are seamlessly integrated by adding new propagators without disrupting existing computations.
      • Quote: "New ways to make contributions can be added just by adding new propagators."

      Propagator System

      • Language Independence: The model can be implemented in any programming language as long as a communication protocol is maintained.
      • Quote: "You should be able to write propagators in any language you choose."
      • Cell Operations: Cells support adding content, collecting content, and registering propagators for notifications on content changes.
      • Quote: "Cells must support three operations: add some content, collect the content currently accumulated, register a propagator to be notified when the accumulated content changes."
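
The three cell operations quoted above can be sketched in a few lines. This is a minimal illustration in Python, not the Scheme-Propagators implementation: the class name `Cell` and the method names `add_content`, `content`, and `register` are hypothetical, and `None` is assumed to stand for "nothing known yet".

```python
class Cell:
    """Hypothetical cell supporting the three operations named in the quote."""

    def __init__(self):
        self._content = None      # None stands for "nothing known yet" (assumption)
        self._propagators = []

    def add_content(self, value):
        """Operation 1: add some content; notify only if it is new."""
        if value is None or value == self._content:
            return                # redundant information, nothing to do
        if self._content is not None:
            raise ValueError("contradiction: %r vs %r" % (self._content, value))
        self._content = value
        for propagator in list(self._propagators):
            propagator()

    def content(self):
        """Operation 2: collect the content accumulated so far."""
        return self._content

    def register(self, propagator):
        """Operation 3: register a propagator to be notified of changes."""
        self._propagators.append(propagator)


seen = []
cell = Cell()
cell.register(lambda: seen.append(cell.content()))
cell.add_content(42)
cell.add_content(42)   # same information again: no second notification
print(seen)            # [42]
```

Treating a repeated value as "no change" is what keeps a network of such cells from notifying its propagators forever.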

      Implementing Propagator Networks

      • Creating Cells and Propagators: Cells store data, while propagators compute based on cell data. Propagators are attached using d@ (diagram style) or e@ (expression style) for simpler cases.
      • Quote: "The cells' job is to remember things; the propagators' job is to compute."
      • Example: Adding two and three using propagators.
      • Quote: "(define-cell a) (define-cell b) (add-content a 3) (add-content b 2) (define-cell answer (e:+ a b)) (run) (content answer) ==> 5"

      Advanced Features

      • Conditional Network Construction: Delayed construction using conditional propagators like p:when and p:if to control network growth.
      • Quote: "The switch propagator does conditional propagation -- it only forwards its input to its output if its control is 'true'."
      • Partial Information: Cells accumulate partial information, which can be incrementally refined.
      • Quote: "Each 'memory location' of Scheme-Propagators, that is each cell, maintains not 'a value', but 'all the information it has about a value'."
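
To make "all the information it has about a value" concrete, here is a small Python sketch in which a cell holds a numerical interval and narrows it as new intervals arrive. The names `Interval`, `IntervalCell`, and `merge` are hypothetical; the merge rule (intersection) is an assumption chosen because it only ever adds information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    low: float
    high: float

    def merge(self, other):
        """Combine two pieces of partial information by intersecting them."""
        low = max(self.low, other.low)
        high = min(self.high, other.high)
        if low > high:
            raise ValueError("contradiction: intervals do not overlap")
        return Interval(low, high)


class IntervalCell:
    """Hypothetical cell whose content is an interval that can only narrow."""

    def __init__(self):
        self.info = None          # None stands for "nothing known yet"

    def add_content(self, interval):
        self.info = interval if self.info is None else self.info.merge(interval)


height = IntervalCell()
height.add_content(Interval(40.0, 50.0))   # first estimate
height.add_content(Interval(44.0, 60.0))   # second, independent estimate
print(height.info)   # Interval(low=44.0, high=50.0)
```

Because intersection is commutative and associative, the cell ends up with the same interval whichever estimate arrives first, which is one way to read the later remark that monotonic accumulation makes race conditions irrelevant to the final result.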

      Built-in Partial Information Structures

      • Types: Nothing, Just a Value, Numerical Intervals, Propagator Cells, Compound Data, Closures, Supported Values, Truth Maintenance Systems, Contradiction.
      • Quote: "The following partial information structures are provided with Scheme-Propagators: nothing, just a value, intervals, propagator cells, compound data, closures, supported values, truth maintenance systems, contradiction."

      Debugging and Metadata

      • Debugging: Scheme's built-in debugger aids in troubleshooting propagator networks. Metadata tracking for cells and propagators enhances debugging.
      • Quote: "The underlying Scheme debugger is your friend."
      • Metadata: Tracking names and connections of cells and propagators helps navigate and debug networks.
      • Quote: "Inspection procedures using the metadata are provided: name, cell?, content, propagator?, propagator-inputs, propagator-outputs, neighbors, cell-non-readers, cell-connections."

      Benefits of the Propagator Model

      • Additivity and Redundancy: Supports incremental additions and multiple redundant computations, enhancing flexibility and resilience.
      • Quote: "It is easy to add new propagators that implement additional ways to compute any part of the information about a value in a cell."
      • Intrinsic Parallelism: Each component operates independently, making the model naturally parallel and race condition-resistant.
      • Quote: "The paradigm of monotonically accumulating information makes [race conditions] irrelevant to the final results of a computation."
      • Dependency Tracking: Facilitates easier integration and conflict resolution via premises and truth maintenance.
      • Quote: "If the addition turns out to conflict with what was already there, it (or the offending old thing) can be ignored, locally and dynamically, by retracting a premise."

      Conclusion

      • Goal Achievement: The Propagator Model approaches goals of extensibility and additivity by allowing flexible integration and redundancy in computations.
      • Quote: "Systems built on the Propagator Model of computation can approach some of these goals."
    1. Video summary [00:00:00] - [00:29:11]:

      This video presents a discussion on bullying and violence at school, led by Myriam Ilouz and Catherine Perelmutter. It covers the psychological impact of bullying, how it has changed over time, and the difficulty of identifying bullies in a school setting.

      Highlights:
      + [00:00:00] Introduction and context
        * Introduction of Myriam Ilouz, clinical psychologist
        * Importance of the topic of school bullying
      + [00:01:00] Bullying at school
        * Impact of bullying on the child's introduction to social life
        * The suffering of both victims and bullies
      + [00:03:02] Change in the nature of bullying
        * Increase in the intensity and violence of bullying
        * Use of modern media to bully
      + [00:10:02] Characteristics of bullies
        * Aggressiveness; intentional and repeated harm
        * Establishment of a relationship of social asymmetry
      + [00:20:01] Perversion and bullying
        * Bullying as a manifestation of narcissistic perversion
        * Difficulty of unmasking bullies and protecting victims
      + [00:27:02] Societal consequences of bullying
        * Impact on trust in school and in society
        * Need for a solid education to prevent perversion

      Video summary [00:29:13] - [00:56:27]:

      The video addresses the problem of bullying and violence at school, focusing on changes in child-rearing, the impact of modern society on pupils' behaviour, and the challenges faced by teachers and parents. It stresses the importance of understanding the law, frustration, and symbolic castration in education in order to prevent bullying.

      Highlights:
      + [00:29:13] The evolution of child-rearing
        * French psychoanalysis and its impact
        * The creation of "child kings" and the lack of clear rules
        * The importance of law and frustration in education
      + [00:32:35] The role of parents and teachers
        * The disappearance of authority and hierarchy
        * The need for an educational alliance between school and family
        * The importance of respecting teachers' authority
      + [00:39:01] School bullying and the justice system
        * Legal definition and forms of bullying
        * The impact of social networks on bullying
        * The vulnerability of victims and the importance of prevention
      + [00:47:02] Measures and laws against bullying
        * Statistics and government plans
        * The right to education and protection against violence
        * Initiatives to improve the school climate and prevent bullying

      Video summary [00:56:29] - [01:14:14]:

      This video addresses bullying and violence at school, focusing on the legal aspects of bullying in France, in particular the changes introduced by the law of 2 March 2022. It explains the new provisions of the French Penal Code (Code pénal) on school bullying, the penalties incurred, and the steps to take to prove bullying and obtain justice.

      Highlights:
      + [00:56:29] Legal framework for bullying
        * Discussion of the French Penal Code and the law of 2 March 2022
        * Explanation of the articles relating to bullying
        * Importance of evidence and legal procedures
      + [01:01:14] A specific court case
        * Examination of a judgment of the Épinal juvenile court (tribunal pour enfants)
        * Analysis of causality between the bullying and the suicide
        * Mention of a possible appeal before the court of appeal
      + [01:07:58] A mother's testimony
        * A mother's account of the bullying suffered by her daughter
        * Difficulties encountered in obtaining support and justice
        * Call for concrete action to support victims

    1. Video summary [00:00:00] - [00:26:57]:

      This video presents the Conseil Économique, Social et Environnemental (CESE) in France: its missions, its composition, and its impact on society. The CESE is described as a bridge between citizens and public authorities, offering a platform for participatory democracy and the development of public policy.

      Highlights:
      + [00:00:00] Role and missions of the CESE
        * Advises the government and Parliament
        * Fosters participatory democracy
        * Evaluates the effectiveness of public policies
      + [00:06:29] Composition of the CESE
        * 175 councillors drawn from various sectors
        * Representation of organised civil society
        * Varied interest and affinity groups
      + [00:14:15] Debates and proposals
        * Discussions on current affairs and social issues
        * Contributions from members on a range of themes
        * Proposals to improve everyday life
      + [00:21:05] Gender inequality, climate crisis, and ecological transition
        * Analysis of the impact of gender on ecological issues
        * Vulnerability of women in crises
        * Role of women in promoting sustainability

      Video summary [00:26:59] - [00:53:06]:

      The video presents a discussion on solutions for building a sustainable society that respects gender equality. It covers ecofeminism, gender mix in occupations, and the impact of climate change on women.

      Highlights:
      + [00:27:02] Ecofeminism
        * Parallel between the domination of nature and that of women
        * Vision of a society without patriarchy or domination
        * Importance of reconnecting with the living world
      + [00:29:17] Gender stereotypes
        * Impact of stereotypes from childhood
        * Influence on life and on the relationship to nature
        * Need to promote gender mix in occupations
      + [00:31:18] Gender equality in public policy
        * Link between gender equality and action for the living world
        * Integration of gender realities into climate solutions
        * Importance of feminist diplomacy and of funding feminist associations
      + [00:38:04] Women's participation in the environmental struggle
        * Women as major actors in the fight for the environment
        * Paradigm shift to value their skills
        * Connection between social and environmental issues

      Video summary [00:53:08] - [01:17:09]:

      The video presents solutions for building a sustainable society that respects gender equality, focusing on the differentiated impacts of climate change on women and men. It stresses the importance of integrating gender equality into environmental policy and the need for concrete action to protect women's rights.

      Highlights:
      + [00:53:08] Introduction and quiz
        * Presentation of the main recommendations
        * Interactive quiz to assess knowledge of gender equality
        * Importance of gender equality in disaster management
      + [01:00:04] Differentiated impact of climate change
        * Women are disproportionately affected by climate disasters
        * Climate crises increase violence against women
        * Need to support projects led by women
      + [01:05:03] Integrating gender equality into policy
        * France's feminist diplomacy and its implications
        * Importance of evaluating international commitments
        * Safety of women displaced by climate change
      + [01:14:00] Consequences of industrialised activities
        * Rich countries are responsible for climate crises
        * Developing countries are the most affected
        * Call for legal protection of environmental migrants

      Video summary [01:17:11] - [01:39:37]:

      This part of the video addresses solutions for building a sustainable society that respects gender equality. It highlights the integration of gender issues into environmental policy, the importance of public investment in the ecological transition, and the role of local authorities and businesses in promoting gender equality.

      Highlights:
      + [01:17:11] Integrating gender into environmental taxation
        * Avoid reinforcing existing inequalities
        * Correct inequalities through public investment
        * French strategy for energy and climate
      + [01:18:00] Cross-cutting objectives of ecology and equality
        * Integrate ecological and inequality-reduction objectives
        * Document with gender-specific data
        * Every euro spent must also benefit gender equality
      + [01:20:04] Mobility policy and impact on women
        * Example of cycling promotion and its consequences for public space
        * Need for local authorities to combine environmental and gender themes
        * Inclusive policies such as those of the City of Geneva
      + [01:24:04] Occupational inequalities in greening occupations
        * Under-representation of women in greenhouse-gas-emitting sectors
        * Importance of including women in the ecological transition
        * Removing obstacles to women's participation in these occupations

      Video summary [01:39:39] - [02:03:26]:

      The video addresses solutions for building a sustainable society that respects gender equality. It highlights the challenges and recommendations of the Conseil Économique, Social et Environnemental (CESE) in France, notably regarding the reception of refugees, public policy, biodiversity, pollution, and democratic participation.

      Highlights:
      + [01:40:00] Challenges and reactions to gender equality
        * Discussion of negative reactions to work on gender
        * Confusion between the male gender and masculinity
        * Importance of pre-empting negative reactions
      + [01:41:06] Integrating gender issues into environmental policy
        * Link between biodiversity, pollution, and gender inequality
        * Need for an integrated and detailed approach
        * Difficult choices among the areas of recommendation
      + [01:45:52] Reception of refugees and specific care for women and girls
        * Recommendation to incorporate case law into the Code de l'entrée et du séjour des étrangers (the French code on the entry and residence of foreigners)
        * Specific needs of refugee women and girls
        * Importance of dedicated projects and financial support
      + [01:57:43] Importance of sex-disaggregated data for public policy
        * Data collection to better understand and act
        * Continuous evaluation of existing tools
        * Need to improve the professional equality index

      Video summary [02:03:29] - [02:29:31]:

      The video addresses solutions for building a sustainable society that respects gender equality. It stresses the importance of integrating women into green jobs, the need for feminist diplomacy, and the impact of climate change on women. It calls for better representation of women in political and environmental decision-making and for the adoption of gender-sensitive public policies.

      Highlights:
      + [02:03:29] Impact of climate change on women
        * Importance of research on differences in impact between the sexes
        * Need for better representation of women in green jobs
        * Call for feminist diplomacy and suitable public policies
      + [02:06:00] Gender justice and climate justice
        * Link between preserving the planet and the evolution of society
        * Gender justice as a central element of climate justice
        * Public policies must integrate gender equality
      + [02:10:21] Role of women in fighting the climate crisis
        * Women are more vulnerable and more exposed to natural disasters
        * Need to recognise and promote women's innovations
        * Importance of gender equality for changing public policy
      + [02:15:16] Integrating gender into the ecological transition
        * Gender inequalities exacerbate the impact of the climate crisis
        * Proposals for integrating gender into climate adaptation strategies
        * Valuing women's action and including them in decision-making
      + [02:22:14] Women's engagement in the ecological transition
        * Women must be major actors in the fight against climate change
        * Need for a cross-cutting approach to climate and equality policies
        * Importance of collecting sex-specific data to inform policy
      + [02:27:00] Role of women in agriculture and organic production
        * Evolution of the agricultural sector, with more women heading farms
        * Challenges women face in adapting to climate change
        * Women's economic emancipation as a goal for social and climate justice

      Video summary [02:29:33] - [02:42:34]:

      This part of the video addresses solutions for building a sustainable society that respects gender equality. It stresses the importance of integrating gender specificities into national and international policies, particularly regarding the consequences of climate disruption for women. The video highlights the need to raise awareness among economic actors and support them on these issues, as well as to combat gender stereotypes in green jobs.

      Highlights:
      + [02:29:33] Integrating gender specificities
        * Importance in climate policies
        * Disproportionate impact on women
        * Solutions led by women
      + [02:31:38] Awareness and education
        * Importance of raising awareness from school onwards
        * Combating gender stereotypes
        * Gender mix in green jobs
      + [02:34:15] Gendered data and public policy
        * Need for data to reduce inequalities
        * Commitment to new policies
        * Women's participation in decisions
      + [02:36:28] Feminist diplomacy and sustainable development
        * Specificity of women's demands
        * Importance of education and training
        * Women's access to all occupations

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02393

      Corresponding author(s): Katja Petzold

      1. General Statements [optional]

      We thank the reviewers for recognising the impact of our manuscript. The reviewers noted the novelty of the miRNA bulge structure, the importance of the three observed binding modes and their potential for use in future structure-based drug design, and the possible importance of the duplex release phenomenon. We are also thankful for the relevant and constructive feedback provided.

      Our responses to the comments are written point by point in blue, and any changes in the manuscript are shown in red.

      2. Description of the planned revisions

      In response to Reviewer 1 - major comment 2

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds.

      Since the effect appears to be specific to the miRNA, we would like to test whether it can be observed for miR-34a in a larger dataset. Therefore, we plan to transfect HEK293T cells with miR-34a and analyse the mRNA response via RNAseq. We will repeat the analysis shown above, using the predicted number of supplementary pairs to categorise the dataset into groups with or without the effect of supplementary pairing. We will then compare the three seed types within these groups.
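
      Comparing seed types in a larger dataset first requires assigning each target site to a seed class. The following is a minimal sketch of one way to do that, not the authors' pipeline. It assumes the commonly used site definitions (6mer: match to miRNA positions 2-7; 7mer-m8: additional match to position 8; 7mer-A1: an A opposite position 1; 8mer: both) and assumes the hsa-miR-34a-5p sequence as listed in miRBase. The function names and the toy UTR strings are hypothetical.

```python
# Assumption: hsa-miR-34a-5p sequence (5'->3'), 22 nt, as listed in miRBase.
MIR34A = "UGGCAGUGUCUUAGCUGGUUGU"

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def classify_seed_sites(utr, mirna=MIR34A):
    """Find canonical seed matches in a 5'->3' target sequence.

    Returns a list of (start_index_of_6mer_core, site_type) tuples.
    """
    core = reverse_complement(mirna[1:7])   # pairs with miRNA positions 2-7
    m8_base = COMPLEMENT[mirna[7]]          # pairs with miRNA position 8
    hits = []
    start = utr.find(core)
    while start != -1:
        end = start + len(core)
        has_m8 = start > 0 and utr[start - 1] == m8_base
        has_a1 = end < len(utr) and utr[end] == "A"
        if has_m8 and has_a1:
            site = "8mer"
        elif has_m8:
            site = "7mer-m8"
        elif has_a1:
            site = "7mer-A1"
        else:
            site = "6mer"
        hits.append((start, site))
        start = utr.find(core, start + 1)
    return hits


# Toy sequences (hypothetical, for illustration only).
print(classify_seed_sites("GGCACUGCCAUU"))  # 8mer
print(classify_seed_sites("UCACUGCCGU"))    # 7mer-m8
print(classify_seed_sites("UUACUGCCAU"))    # 7mer-A1
```

      Sites classified this way could then be split by predicted supplementary pairing before comparing repression between seed types; non-canonical (seedless) sites are deliberately out of scope for this sketch.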

      In response to Reviewer 2 - minor comment 1, "why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?".

      We plan to investigate the upper band, which we hypothesise is a result of duplex release, using EMSA to ascertain whether the band height agrees with the size of the duplex.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      Evidence, reproducibility and clarity

      Sweetapple et al. Biophysics of microRNA-34a targeting and its influence on down-regulation

      In this study, the authors have investigated binding of miR-34a to a panel of natural target sequences using EMSA, luciferase reporter systems and structural probing. The authors compared binding within a binary and a ternary complex that included Ago2 and find that Ago2 affects affinity and strengthens weak binders and weakens strong binders. The affinity is, however, generally determined by binary RNA-RNA interactions also in the ternary complex. Luciferase reporter assays containing 12 different target sites that belong to one of three seed-match types were tested. Generally, affinity is a strong contributor to repression efficiency. Duplex release, a phenomenon observed for specific miRNA-target complementarities, seems to be more pronounced when high affinity within the binary complex is observed. Furthermore, the authors use RABS for structural probing either in a construct in CIS or binding by the individual miRNA in TRANS or in a complex with Ago2. They find pronounced asymmetric target binding and Ago2 does not generally change the binding pattern. The authors observe one specific structural group that was unexpected, namely mRNA binding with bulged miRNAs, which was expected to be sterically problematic based on the known structures. MD simulations, however, revealed that such structures could indeed form.

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allows for a broad correlation of target binding and repression strength, which is a well-thought-out and highly conclusive approach. However, there are a number of shortcomings that are summarized below.

      The manuscript is not easy to read and to follow for several reasons. First, many of the sub-Figures are not referenced in the text of the results section (1C, 1D, 2C, 4D), which is somewhat annoying. Figure 4A seems to be mis-labeled. Second, a lot of data is presented in suppl. Figures. It should be considered to move more data into the main text in order to make it easier for readers to evaluate and follow.

      Thank you for bringing this to our attention. We have now revised the figure references accordingly.

      We have relocated gel images of BCL2, WNT1, MTA2 and the control samples from Figure S3 and S4 to the main results (Figure 2A-B) to improve readability and provide controls and details that aid in clear understanding. Additionally, we have relocated panel C from Figure S6 to Figure 2C to enhance the clarity of our rationale for using polyuridine (pU) in our AGO2 binding assays.

      The updated figure is shown below, with changes to the legend marked in red.

      Figure 2. Binary and ternary complex binding affinities measured by EMSA. (A) Binary (mRNA:miR-34a) binding assays showing examples of BCL2, WNT1 and MTA2. (B) Ternary (mRNA:miR-34a-AGO2) binding assays showing examples of BCL2, WNT1, MTA2, and the three control targets PERFECT, SCRseed, and SCRall. The Cy5 labelled species is indicated with asterisk (*). F indicates the free labelled species (miR34a or mRNA), B indicates binary complex, and T indicates ternary complex. Adjacent titration points differ two-fold in concentration, with maximum concentrations stated at the top right. Adjacent titration points for MTA2 differed three-fold to assess a wider concentration range. In the ternary assay, miRNA duplex release from AGO2 was observed for, amongst others, BCL2, WNT1, PERFECT, and SCRseed (band indicated with B), while it was not observed for SCRall and MTA2. See Figures S3 and S4 for representative gel images for all targets. See Supplementary files 2 and 3 for all images and replicates. (C) Titrations with increasing miR-34a-AGO2 concentration against Cy5-labelled SCRall (left) or PNUTS (right) comparing the absence and presence of 20 μM polyuridine (pU) during equilibration. pU acted as a blocking agent, reducing nonspecific binding, as seen by the different KD,app values for SCRall and PNUTS after addition of 20 μM pU. Therefore, all final mRNA:miR-34a-AGO2 EMSAs were carried out in the presence of 20 μM pU. Labels are as stated above. (D) Individual binding profiles for each of the 12 mRNA targets assessed by electrophoretic mobility assay (EMSA). Each datapoint represents an individual experiment (n=3). Blue represents results for the binary complex, and green represents results for the ternary complex. Dotted horizontal lines represent the KD,app values, which are also stated in blue and green with standard deviations (units = nM). Note that the x-axis spans from 0.1 to 100,000 in CCND1, MTA2 and NOTCH2, whereas the remaining targets span 0.1 to 10,000.
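
      To make the titration design in the legend concrete, the sketch below builds a two-fold (or three-fold) dilution series and estimates an apparent KD from fraction-bound values. It is illustrative only: the single-site binding model, the log-spaced grid search, and all function names are assumptions, since the fitting procedure used for the KD,app values is not described in this response, and the data in the example are synthetic.

```python
def dilution_series(max_conc_nm, n_points, fold=2.0):
    """Concentrations from max_conc_nm downwards, each step `fold` times lower."""
    return [max_conc_nm / fold ** i for i in range(n_points)]


def fraction_bound(conc_nm, kd_nm):
    """Single-site binding isotherm (assumed model): theta = [L] / (KD + [L])."""
    return conc_nm / (kd_nm + conc_nm)


def fit_kd_app(concs_nm, fractions):
    """Least-squares estimate of KD,app over a log-spaced grid (0.1 to 100,000 nM)."""
    grid = [10 ** (-1 + 6 * i / 600) for i in range(601)]

    def sum_sq_error(kd):
        return sum((fraction_bound(c, kd) - f) ** 2
                   for c, f in zip(concs_nm, fractions))

    return min(grid, key=sum_sq_error)


# Synthetic, noise-free example: a 10-point two-fold series from 268 nM.
concs = dilution_series(268.0, 10)
data = [fraction_bound(c, 25.0) for c in concs]
kd_est = fit_kd_app(concs, data)
print(f"estimated KD,app = {kd_est:.1f} nM")
```

      A grid search keeps the example dependency-free; with real gel quantifications, a nonlinear least-squares routine and replicate-based error estimates would be the more usual choice.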

      Some of the data is over-interpreted. For example, in Figure 3A, it is concluded that supplementary regions are more important for weaker seeds. Only two 8-mer seeds are present among the twelve target sites and thus it might be difficult to generalize.

      We have revised our wording to recognise that more 8-mer sites would be required to draw a stronger conclusion based on this hypothesis. This hypothesis would be interesting to confirm in a larger dataset but is unfortunately outside of the scope of this paper.

      Our hypothesis also aligns with recent data from Kosek et al. (NAR 2023; Figure 2D) where SIRT1 with an 8mer and 7mer-A1 seed was compared. Only the 7mer-A1 was sensitive to mutations in the central region or switching all mismatched to WC pairs.

      Page 21 now states:

      "This result indicates that the impact of supplementary binding may be greater for targets with weaker seeds, as has been observed earlier in a mutation study of miR-34a binding to SIRT1 (Kosek et al., 2023), although a larger sample size would be needed to confirm this observation."

      Furthermore, we found the relationship between seed type and the effect of supplementary pairing in our data intriguing. To further investigate this effect, we tested whether it exists in published microarray data from HCT116 cells transfected with six different miRNAs (Linsley et al., 2007; Agarwal et al., 2015). Here we found that for the two miRNAs (miR-103 and miR-106b) where we see an impact of supplementary pairing, the difference is primarily driven by 7mer-m8 seeds. We therefore plan to test whether the effect can be observed for miR-34a in a larger dataset. We have outlined our preliminary data and planned experiments in Section 2 - description of the planned revisions.

      I did not understand why the CIS system shown in 4A is a good test case for miR-34a-target binding. It appears very unnatural and artificial. This needs to be rationalized better. Otherwise it remains questionable, whether these data are meaningful at all.

      Thank you for pointing out the need for clearer rationalisation.

      The TRANS construct, where the scaffold carries the mRNA targeting sequence, provides reactivity information for the mRNA side only, while the microRNA is bound within RISC, with the backbone protected by AGO2. Therefore, to gain information on the miR-34a side of each complex we used the CIS construct, which provides reactivity information from both the miRNA and mRNA. We used the miRNA and mRNA reactivities to calculate all possible secondary structures for the binary complex, and then compared these structures to the mRNA reactivity in TRANS to find which structure fitted the reactivity patterns observed in the ternary complex.

      We have included an additional statement in the manuscript to clarify this point on pages 12-13:

      "Two RNA scaffolds were used for each mRNA target; i) a CIS-scaffold: RNA scaffold containing both mRNA target and miRNA sequence separated by a 10 nucleotide non-interacting closing loop, and ii) a TRANS-scaffold: RNA scaffold containing only the mRNA target sequence, to which free miR-34a or the miR-34a-AGO2 complex was bound (Figure 4A). The CIS constructs therefore provided reactivity information on the miRNA side, which is lacking in the TRANS construct, and was used to complement the TRANS data."

      It may be worth noting that a non-interacting 10 nucleotide loop was inserted between the miRNA and mRNA of the CIS constructs, allowing the miRNA and mRNA strands to bind and release freely. The reactivity patterns of each mRNA:miRNA duplex were compared between CIS and TRANS, and showed similar base pairing (Figure 4D). Furthermore, we have previously compared the two scaffolds in our RABS methodology paper (Banijamali et al. 2022), where no differences were observed besides reduced end fraying in the CIS construct.

      For the TRANS experiments, only one specific scaffold structure is used. This structure might impact binding as well and thus at least one additional and independent scaffold should be selected for a generalized statement.

      For each construct, the potential for interaction with the scaffold was tested using the RNAstructure (Reuter & Mathews, 2010) package. Based on the results of this assessment, two different scaffolds were used for our TRANS experiments. The testing and use of scaffolds has now been clarified further on page 13:

      "The overall conformation of each scaffold with the inserted RNA was assessed using the RNAstructure (Reuter & Mathews, 2010) package to ensure that the sequence of interest did not interact with the scaffold. If any interaction was observed between the RNA of interest and the scaffold, then the scaffold was modified until no predicted interaction occurred. The different scaffolds and their sequence details are shown in supplementary information (Table S1)."

      We have previously examined the scaffold's effect on binding and structure during the development of the RABS method. We tested the same mRNA (SIRT1) in separate, independent scaffolds to verify the consistency of the results. An example of this can be found in the supplementary information (Figure S1a) of Banijamali et al. (2022).

      Generally, it would be nice to have some more information about the experiments also in the result section. Recombinant Ago2 is expressed in insect cells and re-loaded with miR-34a, luciferase reporters are transfected into tissue culture cells, I guess.

      We have now stated the cell types used for AGO2 expression and luciferase reporter assays in the results.

      On page 17 we have included:

      "Samples of each of the 12 mRNA targets, as well as miR-34a and AGO2, were synthesised in-house for biophysical and biological characterisation. Target mRNA constructs were produced via solid-phase synthesis while miR-34a was transcribed in vitro and cleaved from a tandem transcript (Feyrer et al., 2020), ensuring a 5' monophosphate group. AGO2 was produced in Sf9 insect cells."

      "To measure the affinity of each mRNA target binding to miR-34a, both within the binary complex (mRNA:miR-34a) and the ternary complex (mRNA:miR-34a-AGO2), we optimised an RNA:RNA binding EMSA protocol to suit small RNA interactions. The protocol is loosely based on Bak et al. (2014), with major differences being use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and fluorescently labelled probes."

      Page 19:

      " We successfully tested various RNA backgrounds, including polyuridine (pU) and total RNA extract (Figure S6B) to block any unspecific binding. Ultimately, we supplemented our binding buffer with pU at a fixed concentration of 20 µM for the ternary assays to achieve the greatest consistency."

      Page 20:

      "Repression efficacy for the 12 mRNA targets by miR-34a was assessed through a dual luciferase reporter assay. Target mRNAs were cloned into reporter constructs and transfected into HEK293T cells."

      Page 22:

      "To infer base pairing patterns and secondary structure for each of the 12 mRNA:miR-34a pairs, we used the RABS technique (Banijamali et al., 2023) with 1M7 as a chemical probe. All individual reactivity traces are shown in Figure S9. Reactivity of each of the 22 miR-34a nucleotides was assessed upon binding to each of the 12 mRNA targets within a CIS construct, containing both miR-34a and the mRNA target site separated by a non-interacting 10-nucleotide loop. The two RNAs can therefore bind and release freely within the CIS construct and reactivity information is collected from both RNA strands."

      In the first sentence of the abstract, Argonaute 2 should be replaced by Argonaute only since other members bind to miRNAs as well.

      Thank you for recognising this. It has now been corrected.

      Significance

      This is an interesting manuscript that contributes to our mechanistic understanding of the miRNA-target pairing rules. The combination of affinity measurements, structural probing and luciferase reporters allows for a broad correlation of target binding and repression strength, which is a well-thought-out and highly conclusive approach. However, there are a number of shortcomings.

      We thank the reviewer for recognising the approach and impact of our work. In addition we thank the reviewer for identifying the need for further data to support our conclusions from the luciferase assays, which is something that we plan to address, as described in section 2.



      Reviewer #2

      Evidence, reproducibility and clarity

      Summary: Sweetapple et al. took the approaches of EMSA, SHAPE, and MD simulations to investigate target recognition by miR-34a in the presence and absence of AGO2. Surprisingly, their EMSA showed that guide unloading occurred even with seed-unpaired targets. Although previous studies reported guide unloading, they used perfectly complementary guide and target sets. The authors of this study concluded that the base-pairing pattern of miR-34a with target RNAs, even without AGO2, can be applicable to understanding target recognition by miR-34a-bound AGO2.

      Major comments:

      (Page 11 and Figure S4) The authors pre-loaded miR-34a into AGO2 and subsequently equilibrated the RISC with a 5' modified Cy5 target mRNA. Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a) in the EMSA (guide unloading has been a long-standing controversy). However, they observed bands of the binary complex in Figure S4. The authors did not use ion-exchange chromatography. AGOs are known to bind RNAs nonspecifically on their positively charged surface. Is it possible that most miR-34a was actually bound to the surface of AGO2 instead of being loaded into the central cleft? This could explain why they observed the bands of the binary complex in EMSA.

      Thank you for mentioning this crucial point which has been a focus of our controls. We have addressed this point in four ways:

      Salt wash during reverse IMAC purification. Separation of unbound RNA and proteins via SEC. Blocking non-specific interactions using polyuridine. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a specific ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient with buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM Imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM Imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendation by BioRad for their Profinity IMAC resins, which states that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of ion identity or chemical nature of the ion involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have a comparable ionic strength to the Flores-Jasso protocol.

      Our total ionic contributions were: 500 mM Na+, 550 mM Cl-, 50 mM Tris and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionized according to the pH of the buffer (pH 8) and their respective pKa values, but even when considering only the sodium and chloride, the ionic strength should be comparable to that of the Flores-Jasso protocol.
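
      The comparison can be checked with the standard definition of ionic strength, I = ½ Σ cᵢzᵢ². The sketch below is a rough estimate under stated assumptions, not a calculation taken from the manuscript: salts are treated as fully dissociated; partially ionized buffer species (Tris, imidazole, HEPES) and TCEP/DTT are ignored; and the Flores-Jasso elution buffer is assumed to be the lysis buffer with potassium acetate raised to 500 mM, so that acetate totals 504 mM once magnesium acetate is included.

```python
def ionic_strength(ions):
    """Ionic strength in mol/L from (concentration in mol/L, charge) pairs."""
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)


# IMAC buffer B, counting only Na+ and Cl- as in the response above.
imac_buffer_b = [
    (0.500, +1),  # Na+ from 500 mM NaCl
    (0.550, -1),  # Cl- from NaCl plus Tris-HCl
]

# Assumed Flores-Jasso elution buffer: 500 mM potassium acetate,
# 2 mM magnesium acetate (HEPES-KOH and DTT ignored).
flores_jasso_elution = [
    (0.500, +1),  # K+
    (0.504, -1),  # acetate: 500 mM + 2 x 2 mM
    (0.002, +2),  # Mg2+
]

print(f"IMAC buffer B:        {ionic_strength(imac_buffer_b):.3f} M")
print(f"Flores-Jasso elution: {ionic_strength(flores_jasso_elution):.3f} M")
```

      Under these simplifications the two values come out at roughly 0.53 M and 0.51 M, which is consistent with the statement that the two washes are comparable in ionic strength.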

      We have restated the buffer compositions and rewritten the methods section to describe this more explicitly:

      "Following dialysis, any precipitate was removed by centrifugation, and the resulting supernatant was loaded onto an IMAC buffer A-equilibrated HisTrap-Ni2+ column to remove TEV protease, other proteins, and non-specifically bound RNA. A linear gradient was employed using IMAC buffers A and B."

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown in Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acid nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogenous RNA extract or homogenous polyuridine (pU) in our EMSA buffer during equilibration background experiments. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved the panel C of figure S6 into the main results in Figure 2C, to highlight the novel conditions used and important controls needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that polyU reduces non-specific interactions between AGO2 and the RNAs present, yet duplex release still occurs despite the blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)", we would like to refer to three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. AGO recycling is known to occur, so there must be some mechanism that releases miRNA from AGO2. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of the enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

      (Page 18 and Figure S5) Previous studies (De et al., Jo MH et al., Park JH et al.) reported guide unloading when they incubated a RISC with a fully complementary target. However, neither MTA2, CCND1, CD44, nor NOTCH2 can be perfectly paired with miR-34a (Figure 1A). Therefore, the unloading reported in this study is quite different from the previously reported works and thus cannot be explained by the previously reported logic. The authors need to explain the guide unloading mechanism that they observed. Otherwise, they might misinterpret the results of their EMSA and RABS of the ternary complex.

      The three aforementioned studies have reported unloading/duplex release. However, they did not only report fully complementary targets in this process.

      De et al. (2013) reported that "highly complementary target RNAs promote release of guide RNAs from human Argonaute2".

      Subsequently, Park et al. (2017) reported: "Strikingly, we showed that miRNA destabilization is dramatically enhanced by an interaction with seedless, non-canonical targets."

      A figure extracted from Figure 5 of Park et al. is shown below illustrating the occurrence of unloading in the presence of seed mismatches in positions 2 and 3 (mm 2-3). Jo et al. (2015) also reported that binding lifetime was not affected by the number of base pairs in the RNA duplex.

      In addition to these three reports, a methodology paper focusing on miRNA duplex release was published recently titled "Detection of MicroRNAs Released from Argonautes" (Min et al., 2020).

      Therefore, we do believe that the previously observed microRNA release is similar to our observation. Here we also correlate it to structure and stability of the complex.

      (Page 20) The authors reported, "it is notable that the seed region binding does not appear to be necessary for duplex release." The crystal structures of AGO2 visualize that the seed of the guide RNA is recognized, whereas the rest is not, except for the 3' end captured by the PAZ domain. How do the authors explain the discrepancy?

      In this manuscript, we intend to present our observations of duplex release. There are many potential relationships between duplex release and AGO2 activity, which we do not have data to speculate upon. Previous studies, such as Park et al. (2017), have also observed non-canonical and seedless targets leading to duplex release, supporting our findings. Additionally, other publications including McGeary et al. (2019) report 3'-only miRNA targets, Lal et al. (2009) have documented seedless binding by miRNA and their downstream biological effects, and Duan et al. (2022) show that a large number of let-7a targets are regulated through 3′ non-seed pairing.

      It is also possible that duplex release is not coupled to classical repression outcomes, and does not need to proceed by the seed, but instead regulates AGO2 recycling before AGO2 enters the quality control mode of recognising the formed seed.

      (Page 22) The authors mentioned, "It follows that the structure imparted via direct RNA:RNA interaction remains intact within AGO2, highlighting the role of RNA as the structural determinant." A free guide and a target can start their annealing from any nucleotide position. In contrast, a guide loaded into AGO needs to start annealing with targets through the seed region. Additionally, the Zamore group reported that the loaded guide RNA behaves quite differently from its free state (Wee et al., Cell 2012). How do the authors explain the discrepancy?

      The key point we would like to emphasise is that AGO does not seem to alter the underlying RNA:RNA interactions. The bound state in the ternary complex reflects the structure established in the binary complex. We do not aim to claim a specific sequence of events, as this interpretation is not possible from our equilibrium data. Our data indicates that the protein is flexible enough to accommodate the RNA structure that is favoured in the binary complex. This hypothesis is further supported by our MD simulation, which demonstrates the accommodation of a miRNA-bulge structure within AGO2.

      Targets lacking seeds have been identified previously (McGeary et al. 2019, Park et al. 2017, Lal et al. 2009) and can bind to miRNA within AGO. Therefore, there must be a mechanism by which these targets can anneal within AGO, such as via sequence-independent interactions (as discussed in question 3).

      Wee et al. (2012) studied fly and mouse AGO2 and found considerable differences between the thermodynamic and kinetic properties of the two AGO2 species. Furthermore, they found different average affinities between the two species, with the fly AGO binding tighter than the mouse. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse.

      Below is an extract from Wee et al. (2012):

      "Our KM data and published Argonaute structures (Wang et al., 2009) suggest that 16-17 base pairs form between the guide and the target RNAs, yet the binding affinity of fly Ago2-RISC (KD = 3.7 ± 0.9 pM, mean ± S.D.) and mouse AGO2-RISC (KD = 20 ± 10 pM, mean ± S.D.) for a fully complementary target was comparable to that of a 10 bp RNA:RNA helix. Thus, Argonaute functions to weaken the binding of the 21 nt siRNA to its fully complementary target: without the protein, the siRNA, base paired from positions g2 to g17, is predicted to have a KD ∼3.0 × 10⁻¹¹ pM (ΔG25°C = −30.7 kcal mol⁻¹). Argonaute raises the KD of the 16 bp RNA:RNA hybrid by a factor of > 10¹¹."

      In the Wee et al. (2012) paper, affinity data on mouse and fly AGO2 was collected via filter binding assays, using a phosphorothioate linkage flanked by 2′-O-methyl ribose at positions 10 and 11 of the target to prevent cleavage. They then compared the experimentally determined mean KD and ΔG values for each species to predicted values of an RNA:RNA helix of 16-17 base-pairs. No comparison was made between individual targets, and no experimental data was collected for the RNA:RNA binding. The calculated energy values were made based on a simple helix without taking into account any possible secondary structure features. Considering the different AGO species, alternative experimental setup, modified nucleotides in the tested RNA, and the computationally predicted RNA values compared to the averaged experimental values, we believe there is considerable reason to observe differences compared to our findings.
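
      The predicted values quoted from Wee et al. can be reproduced from the stated free energy using the standard relation KD = exp(ΔG/RT), with a 1 M standard state. The sketch below is only a consistency check on the quoted numbers; the function name is hypothetical and the calculation is not taken from either paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_from_delta_g(delta_g_kcal, temp_k=298.15):
    """Dissociation constant in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temp_k))


# Predicted 16 bp RNA:RNA helix from the extract: dG(25 C) = -30.7 kcal/mol.
kd_helix_pm = kd_from_delta_g(-30.7) * 1e12  # convert mol/L to pM

# Measured RISC affinities quoted in the extract (pM).
fold_fly = 3.7 / kd_helix_pm
fold_mouse = 20.0 / kd_helix_pm

print(f"predicted helix KD: {kd_helix_pm:.1e} pM")
print(f"fold weakening, fly Ago2-RISC:   {fold_fly:.1e}")
print(f"fold weakening, mouse AGO2-RISC: {fold_mouse:.1e}")
```

      The result is a KD of about 3 × 10⁻¹¹ pM for the naked helix and a fold change above 10¹¹ for both species, matching the extract. This also illustrates the point made above: the RNA:RNA value is a nearest-neighbour prediction for an idealised helix, whereas the RISC values are experimental averages.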

      We have expanded our discussion on page 27 to the following:

      "An earlier examination of mRNA:miRNA binding thermodynamics by Wee and colleagues (2012) found that mouse and fly AGO2 reduce the affinity of a guide RNA for its target. Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. The 2012 study reported different average affinities between the two AGO2 species, with the fly protein binding tighter than the mouse. Following this logic, it is not unexpected that human AGO2 would have unique properties compared to those of fly and mouse."

      The authors concluded that the range of binary complex affinities is constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders. This may hold true for miR-34a, but it cannot be generalized. Other miRNAs need to be tested.

      That is true, we have now adjusted the wording to encompass this more clearly, shown below. Testing of further miRNAs is the likely content of future work from us and others.

      "Our data indicate that the range of miR-34a binary complex affinities is instead constricted by human AGO2 in the ternary complex - strengthening weak binders while weakening strong binders."

      Minor comments:

      (Figure S2) Why was the 34-nt 3'Cy3-labeled miR34a complementary probe shifted up in the presence of AGO?

      We believe this observation is also indicative of duplex release. At the time that these activity assays were collected, we were not as aware of the presence of duplex release so did not test it further, assuming it may be due to transient interactions. We plan to investigate this via EMSA and have included this in the planned revisions (section 2).

      2. (Page 17) Does the Cy3 affect the interaction of the 3' end of miR-34 with AGO2?

      miR-34a-3'Cy5 was used for binary experiments only and the reverse experiment was conducted as a control (where Cy5 was located on the mRNA) (Figure S3b), showing no change in affinity/interaction when the probe was switched to the target. For ternary experiments the mRNA target was labelled on the 5' terminus, to make sure there was no interference with loading miR-34a into AGO2.

      A Cy3 labelled RNA probe (fully complementary to miR-34a) was used to detect miR-34a in northern blots, but AGO2 interaction is not relevant here under denaturing conditions.

      Otherwise, the 34-nt slicing probe had Cy3 on the 5 nt 3' overhang and should therefore not interact with AGO.

      1. Several groups reported that overproduced AGOs loaded endogenous small RNAs. The authors should mention that their purified AGO2 was not as pure as a RISC with miR-34a. Otherwise, readers might think that the authors used a specific RISC.

      We have now improved our explanation of the loading efficiency to make it more clear to the reader that our AGO2 sample was not fully bound by miR-34a, and that all concentrations refer to the miR-34a-loaded portion of AGO2. The following text can be found in the results on page 18:

      "The mRNA:miR-34a-AGO2 assay had a limited titration range, reaching a maximum miR-34a-AGO2 concentration of 268 nM due to a 5% loading efficiency (see Figure S2D for loading efficiency quantification). The total AGO2 concentration was thus 20-fold higher than the miR-34a-loaded portion. Further increase in protein concentration was prevented by precipitation. Weaker mRNA targets (CD44, CCND1, and NOTCH2) did not reach a saturated binding plateau within this range, leading to larger errors in their estimated KD,app values. However, reasonable estimation of the KD,app was possible by monitoring the disappearance of the free mRNA probe. Note that we refer to the miR-34a-loaded portion of AGO2 when discussing concentration values for all titration ranges. To ensure AGO2 binding specificity despite low loading efficiency, a scrambled control was used (SCRall; lacking stable base pairing with miR-34a or other human miRNAs according to the miRBase database57). SCRall showed no interaction with miR-34a-AGO2 (Figure 2B)."

      (Figure legend of Figure S5) Binding was assessed "by."

      Thank you for pointing this out, it is now fixed.

      (Page 17) It would be great if the authors could even briefly describe the mechanism by which the sodium phosphate buffer with magnesium does not disturb weaker interactions by citing reference papers.

      We have now added a supplementary methods section to our manuscript and included the description below on page 10:

      "We found that a more traditional Tris-borate-EDTA (TBE) buffer disrupted weaker RNA:RNA binding interactions (Supplementary Methods Figure M1). Borate anions form stable adducts with carbohydrate hydroxyl groups (James et al., 1996) and can form complexes with nucleic acids, likely through amino groups in nucleic bases or oxygen in phosphate groups (Stellwagen et al., 2000). This makes TBE unsuitable for assessment of RNA binding, particularly involving small RNA molecules, which typically have weaker affinities. We therefore adapted our buffer system to a sodium phosphate buffer supplemented with magnesium. Magnesium acts as a counterion to reduce electrostatic repulsion between the two negatively charged backbones by neutralisation (Misra et al., 1998)."

      We have also clarified the buffer adaptations in our results section on page 17:

      The protocol is loosely based on Bak et al. (2014)36, with major differences being use of a sodium phosphate buffering system so as not to disturb weaker interactions (James et al., 1996; Stellwagen et al., 2000), supplemented with Mg2+ as a counterion to reduce electrostatic repulsion between the two negatively charged RNAs (Misra & Draper, 1998), and fluorescently labelled probes. Original gel images and quantification are shown in supplementary Figures S3 and S4. All KD,app values are shown in Supplementary Table 1, and represent the mean of three independent replicates.

      Figure M1. Comparison of Tris-borate EDTA (TBE) and sodium phosphate with magnesium (NaP-Mg2+) buffer systems for EMSA. Cy5-labelled miR-34a and unlabelled CD44 were equilibrated in the two different buffer systems, using the same titration range. No mobility shifts were observed in the TBE system, while clear binding shifts were observed in the NaP-Mg2+ system.

      6. (Page 22) The authors cited Figure 4C in the sentence, "Comparison between CIS and TRANS ..." Is this supposed to be Figure 4D?

      The reviewer was correct in their assumption, and this has now been corrected.

      7. (Figure 6) Readers would appreciate it if the guide and target were colored in red and blue. The color codes have been used in most papers reporting AGO structures. The current color codes are opposite.

      We have now adjusted the colour schemes throughout the manuscript, and Figure 6 has been modified to the following:

      "Figure 6. The miRNA-bulge structure is readily accommodated by AGO2 as shown by molecular dynamics simulation. Panel (A) displays a snapshot of the all-atom MD simulation of miR-34a (red) and NOTCH1 (blue) in AGO2. The NOTCH1:miR-34a duplex is shown with AGO2 removed for clarity and is rotated 90° to show the miRNA bulge and bend in the duplex. This NOTCH1:miR-34a-AGO2 structure is compared with (B), which shows the crystal structure of miR-122 (orange) paired with its target (purple) via the seed and four nucleotides in the supplementary region (PDB-ID 6N4O17), and (C), which shows the crystal structure of miR-122 (orange) and its target (green) with extended 3' pairing, necessary for the TDMD-competent state (PDB-ID 6NIT19). AGO2 is depicted in grey, with the PAZ domain in green, and the N-terminal domain marked with N. The miRNA duplexes in (B) and (C) feature symmetrical 4-nucleotide internal loops, whereas the NOTCH1 structure in (A) has an asymmetrical miRNA bulge with five unpaired nucleotides on the miRNA side and a 3-nucleotide asymmetry."

      Significance

      This paper will have a significant impact on the field if seed-unpaired targets can indeed unload guide RNAs. The authors may want to validate their results very carefully.

      We thank the reviewer for recognising the significance of duplex release (or guide unloading) from AGO2. We agree that the observations should be tested rigorously and have outlined the actions we took to ensure validity in our AGO2 preparation.

      Reviewer #3

      Evidence, reproducibility and clarity (Required):

      In this manuscript, the authors use a combination of biochemical, biophysical, and computational approaches to investigate the structure-function relationship of miRNA binding sites. Interestingly, they find that AGO2 weakens tight RNA:RNA binding interactions, and strengthens weaker interactions.

      Given this antagonistic role, I wonder: shouldn't there be an 'average' final binding affinity? Furthermore, if I understand correctly, not many trends were observed to correlate binding affinity with repression, etc.

      Overall, there was no 'average' final binding affinity observed, as the binary assays had a much higher maximum (the NOTCH2 binary affinity was within the micromolar range), skewing the mean of the binary affinities to 657 nM, versus 111 nM for the ternary affinities. We also compared the variances of the binary and ternary affinity datasets using the F-test and found that F > F(critical one-tail), so the variances of the two populations are unequal (the binary variance is significantly larger than the ternary variance).

      F-Test Two-Sample for Variances

                                binary affinity    ternary affinity
      Mean                      657.3              110.971667
      Variance                  2971596.1          24406.4012
      Observations              12                 12
      df                        11                 11
      F                         121.754784
      P(F                       7.559E-10
      F(critical one-tail)      2.81793047
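
      As a minimal sketch of the comparison reported above, the F statistic is simply the ratio of the larger sample variance to the smaller one. The snippet below reproduces it from the tabulated variances and compares it with the tabulated one-tail critical value. The function and constant names are illustrative only, and the critical value is copied from the table rather than recomputed.

```python
def variance_ratio_f(variance_a: float, variance_b: float) -> float:
    """Return the F statistic as larger variance / smaller variance."""
    larger = max(variance_a, variance_b)
    smaller = min(variance_a, variance_b)
    if smaller <= 0:
        raise ValueError("variances must be positive")
    return larger / smaller


# Values copied from the F-test table in the response.
BINARY_VARIANCE = 2971596.1
TERNARY_VARIANCE = 24406.4012
F_CRITICAL_ONE_TAIL = 2.81793047  # taken from the table, not recomputed here

f_statistic = variance_ratio_f(BINARY_VARIANCE, TERNARY_VARIANCE)
variances_differ = f_statistic > F_CRITICAL_ONE_TAIL

print(f"F = {f_statistic:.4f}, exceeds critical value: {variances_differ}")
```

      Because the statistic depends only on the two variances, the large gap between it and the critical value reflects how much wider the spread of binary affinities is than that of the ternary affinities.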

      We agree that the overall correlation between affinity and repression was not strong, although we found a stronger correlation within the miRNA-bulge group (Figure 5C and S7C). A larger sample size of miRNA bulge-forming duplexes would be needed to test the generalizability of this observation.

      Given the context of the study - whereby structure is being investigated as a contributing factor to the interaction between the miRNA and mRNA, I find it interesting that the authors chose to use MC-fold to predict the structures of the mRNA, rather than using an experimental approach to assess / validate the structures. Thirty-seven RNAs were assessed; I think even for a subset (the 12 that were focused on in the study), the secondary structure should be validated experimentally (e.g., by chemical probing experiments, which the research group has demonstrated expertise in over the last several years). The validation should follow the in silico folding approach used to narrow down the region of interest. It is necessary to know whether an energy barrier (associated with the mRNA unfolding) has to occur prior to miRNA binding; this could help explain some of the unexplained results in the study. Indeed, the authors mention that there are many variables that influence miRNA regulation.

      Indeed, experimentally validated structures offer valuable insights that cannot be obtained solely through sequence-based predictions. This is why we opted to employ our RABS method to experimentally evaluate the binary and ternary complex binding of our 12 selected targets (as depicted in Figures 4 and S9 and discussed in the text on pages 23-24). While we (in silico) assessed all 37 RNA targets that were experimentally confirmed at the time, selecting 12 to represent both biological and predicted structural diversity, it would have been impractical to experimentally pre-assess all the targets not included in the final selection. Our in-silico assessment was designed to narrow down the regions of interest and evaluate predicted secondary structures present. The pipeline is shown in Figure 1. Details of the code used in the in-silico analysis are provided in Supplementary File 1.

      Regarding the energy of unfolding of mRNA, our constructs comprised only the isolated binding sites, so the effects of surrounding mRNA interactions were removed. We compared our affinities to ΔG as well as MFE and have now included this analysis in Figure S8A. Additionally, we have included the following text on pages 27-28 of the discussion:

      "Gibbs free energy (ΔG), which is often included in targeting prediction models as a measure of stability of the miRNA:mRNA pair12,62, correlated with the log of our binary KD,app values, using ΔG values predicted by RNAcofold (R2 = 0.61). There was a weaker correlation with the free energy values derived from the minimum free energy (MFE) structures predicted by RNAcofold (R2 = 0.41) (Figure S8A). This result highlights the contribution of unfolding (in ΔG) as being important in predicting KD. The differences between ΔG and KD,app are likely primarily due to inaccurately predicted structures used for energy calculations."

      Additionally, we assessed the free form of all mRNA targets via RABS (Figure S9) and observed that the seed of each free mRNA was available for miRNA binding (seeds of the free mRNA were not stably bound).

      Finally, when designing our luciferase plasmids we used RNAstructure (Reuter & Mathews, 2010) to check for self-folding effects which could interfere with target site binding and ensured that all plasmids were void of such effects.
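
      The reason ΔG is compared with the logarithm of KD,app in the passage quoted above is the standard thermodynamic relation ΔG = RT ln(KD), with KD expressed relative to a 1 M standard state. The sketch below converts a few dissociation constants into free energies. The temperature (298.15 K) and the example KD values are assumptions chosen for illustration and are not taken from the assay conditions.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = R * T * ln(KD) with a 1 M standard state; the default
    temperature is an assumption, not the assay temperature.
    """
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Illustrative KD values in nM (hypothetical spread from tight to weak binding).
for label, kd_nm in (("tight", 1.0), ("intermediate", 165.0), ("weak", 6000.0)):
    dg = delta_g_from_kd(kd_nm * 1e-9)
    print(f"{label:>12}: KD = {kd_nm:7.1f} nM -> dG = {dg:6.2f} kcal/mol")
```

      Under this relation each tenfold change in KD shifts ΔG by a constant amount (RT ln 10), which is why a linear fit is made against log KD rather than KD itself.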

      In the methods, T7 is italicized by accident in the T7 in vitro transcription section. Bacmid is sometimes written with a capital B and other times with a lower-cased b. The authors should be consistent. The concentration of TEV protease that was added (as opposed to the volume) should be described for reproducibility.

      Thank you for pointing out these overlooked points. They have now been corrected.

      In figure S2D, what is the second species in the gel on the right-hand side of the gel in the miR-34a:AGO lanes? The authors should mention this.

      We believe that the faint upper band corresponds to other longer RNA species loaded into AGO2. As AGO2 is loaded with a diversity of RNA species, it is likely that some of them may have a weak affinity for the miR-34a-complementary probe, and therefore show up on the northern blot.

      Figure S3B and S3A are referenced out of order in the text. In regard to S3A, what are the anticipated or hypothesized alternative conformations for NOTCH1, DLL1, and MTA2? There are really interesting things going on in the gels, also for HNF4a and NOTCH2. Can the authors offer some explanation for why the free RNA bands don't seem to disappear, but rather migrate slowly? Is this a new species?

      The order of the figure references has now been updated, thank you for alerting us to this.

      Figure S3A: For MTA2, the two alternative conformations are shown in Figure S9 and S10 (and shown below here, miR-34a seed marked in pink). It appears that a single conformation is favoured at high concentration (> 1 µM) while the two conformations are present at ≤ 1 µM. The RABS data for MTA2 also indicated multiple binding conformations, as the reactivity traces were inconsistent. We expect that the conformation shown on the left was most dominant within AGO2, based on the reactivity of the TRANS + AGO assays. However, we cannot exclude a possible G-quadruplex formation due to the high G content of MTA2 (shown below right).

      Regarding NOTCH1 and DLL1, a faint fluorescent shadow was observed beneath the miR-34a bound band. The RABS reactivity traces indicated a single dominant conformation for these targets, so it is possible that the lower shadow observed was due to more subtle differences in conformation, such as the opening/closing of one or a few base pairs at the terminus or bulge (i.e. end fraying). HNF4α and NOTCH2 appear to never fully saturate the miR-34a, so a small unbound population remains visible on the gel. For NOTCH2 this free miR-34a band appears to migrate upwards, possibly due to overloading the gel lane with excess NOTCH2 (which is not observed in the Cy5 fluorescence image).

      In the EMSA for Perfect, why does the band intensity for the bound complex increase then decrease? How many replicates were run for this? This needs to be reconciled.

      As for all EMSAs, three replicates were carried out for each mRNA target and all gels are shown in Supplementary Files 2 and 3, for the binary and ternary assays respectively.

      Uneven heat distribution across the gel can lead to bleaching of the Cy5 fluorophore. To address this, we used a circulating cooler in our electrophoresis tank, as outlined in our methods (page 10). However, the aforementioned gel for one of the PERFECT sample replicates appears to have been evenly cooled. As the binding ratio (rather than total band volume) was used for quantification, the binding curve was unaffected, and this did not influence KD,app.

      We have now replaced the exemplary gel for PERFECT in Figure S3 with a more representative and evenly labelled gel from our replicates (Cy5 fluorescence image shown below). The binding curve for PERFECT is also shown here:

      The authors list that the RNA concentration was held constant at 10 nM; in EMSAs, the RNA concentration should be less than the binding affinity; what is the lowest concentration of protein used in the assays shown in S3A? Is this a serial dilution? It seems to me like the binding assays for MTA2, Perfect, and SCRseed might have too high of an RNA concentration. (Actually, now I see in the supplement the concentrations of proteins, and the RNA concentration is too high). Also, why is the intensity of bands for bound complex for SCRseed more intense than the free RNA?

      Why are the binding affinity error bars so large (e.g., for NOTCH2 with miR-34a) - 6 µM +/- 3 µM?

      No protein was used in the binary assays shown in Figure S3A. For the ternary assays in Figure S4, the maximum concentration of miR-34a-loaded AGO2 (miR-34a-AGO2) was 268 nM, with a serial dilution down to a minimum of 0.06 nM.

      Optimal EMSA conditions require a constant RNA concentration that is lower than the binding affinity to accurately estimate high-affinity interactions.

      For our tightest binders, such as SIRT1, we can confidently state that the KD,app is less than 10 nM, estimated at 0.4 ± 1.1 nM. Therefore, the accuracy of this estimation is reduced, and the standard deviation is larger than the estimated KD,app. As NOTCH2 bound miR-34a very weakly and did not reach a fully bound plateau, the resulting high error was expected. Consequently, we do not have the same level of certainty for extremely tight or weak binders. In this study, the relative affinities were of primary importance.

      We have included on page 18:

      As the Cy5-miR-34a concentration was fixed to 10 nM to give sufficient signal during detection, KD,app values below 10 nM have a lower confidence.
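
      The loss of confidence below 10 nM can be illustrated with the general quadratic (ligand-depletion) binding equation for 1:1 binding, which does not assume that the labelled probe is much more dilute than KD. This is a hedged sketch, not the fitting model used in the manuscript: the function name, the titrant concentrations, and the KD values are hypothetical, and only the 10 nM probe concentration comes from the text.

```python
import math


def fraction_probe_bound(titrant_nm: float, probe_nm: float, kd_nm: float) -> float:
    """Fraction of labelled probe in complex for 1:1 binding (all values in nM).

    Solves the quadratic mass-balance equation exactly, so it remains valid
    when the probe concentration is comparable to or above KD.
    """
    total = titrant_nm + probe_nm + kd_nm
    complex_nm = (total - math.sqrt(total * total - 4.0 * titrant_nm * probe_nm)) / 2.0
    return complex_nm / probe_nm


PROBE_NM = 10.0  # fixed labelled miR-34a concentration stated in the text
TITRANT_NM = (2.5, 5.0, 10.0, 20.0)  # hypothetical titration points

for kd_nm in (0.1, 1.0, 100.0):  # hypothetical KD values
    curve = ", ".join(f"{fraction_probe_bound(t, PROBE_NM, kd_nm):.2f}" for t in TITRANT_NM)
    print(f"KD = {kd_nm:6.1f} nM -> fraction bound: {curve}")
```

      When KD is far below the probe concentration, the curve approaches a stoichiometric titration whose shape barely depends on KD, so curves for different tight KD values become hard to distinguish. When KD is well above the probe concentration, the curves separate clearly and the fit is better constrained.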

      Regarding the control samples PERFECT and SCRseed, our focus was not on determining the exact KD,app of these artificial constructs. Instead, we were primarily interested in whether they exhibited binding and under which conditions. For SCRseed, we neither adjusted the titration range nor calculated KD,app. For PERFECT, the concentration was adjusted to a lower range of 30 nM - 0.001 nM to give a relative comparison with the other tight binder SIRT1. However, further reduction in RNA concentration was not pursued, as it already fell well below the 10 nM sensitivity threshold.

      Regarding the intensity of the bound SCRseed band, we observed that the bound fluorophore often resulted in stronger intensity than for the free probe. This was observed for a number of the samples (PERFECT, BCL2, SCRseed). A previous publication reported that Cy5 fluorescence is sequence dependent in DNA, that the effect is more pronounced in double-stranded DNA, and that the fluorophore is sensitive to the surrounding 5 base pairs (Kretschy, Sack and Somoza, 2016). It is likely that the same phenomenon exists in RNA.

      For MTA2, the two alternative conformations (shown in Figure S9 and S10) make assessment of KD,app more difficult. As the higher affinity conformation did not reach a fully-bound plateau before the weaker affinity conformation appeared, the binding curve plateau (where all miR-34a was bound) reflected the weaker conformation KD,app. We increased the range of titration tested by using a three-fold serial dilution, but further reduction in RNA concentration would not have been fruitful as it already dropped well below the 10 nM sensitivity range. Therefore the MTA2 binary complex had a higher error (944 ± 274 nM) and lower confidence.

      We then decided to run a competition assay to detect the weaker KD,app of MTA2. The assay was set up using the known binding affinity of CD44, which was labelled with Cy5 to track the reaction. MTA2 was titrated against a constant concentration of Cy5-CD44:miR-34a, and disruption of the CD44 and miR-34a binding was monitored. We fitted the data to a quadratic for competitive binding (Cheng and Prusoff, 1973) to calculate the KD,app for competitive binding, or KC,app.

      We validated our competition assay by comparing it with our direct binding assays, specifically assessing CD44 in a self-competition assay. The CD44 KC,app (168 ± 24 nM; mean and SD of three replicates) was found to be consistent with the KD,app obtained from the direct assay (165 ± 21 nM).

      As we wanted all affinity data to be directly comparable (using the same methodology), we compared the KD,app values obtained via direct assay in the manuscript. It appears that the competitive EMSA assay for MTA2 reflects the weaker affinity conformation observed in the direct assay.

      It would be very helpful if the authors wrote in the Kds in Figure 2A in green and blue (in the extra space in the plots). This would help the reader to better understand what's going on, and for me, as a reviewer, to better consider the analysis/conclusions presented by the authors.

      KD,app values are written in green and blue in what is now Figure 2D (originally Figure 2A).

      The authors state on page 18 that 'Interestingly, however, we did not observe a correlation between binary or ternary complex affinity and seed type.' They should elaborate on why this is interesting.

      The prevailing view is that the miRNA seed type significantly influences affinity within AGO2. The largest biochemical studies of miRNA-target interactions to date, conducted by McGeary et al. (2019, 2022), used AGO-RBNS (RNA Bind-n-Seq) to reveal relative binding affinities. These studies demonstrated strong correlations between the canonical seed types and binding affinity. Therefore, we find it interesting that no such correlation was observed in our dataset (despite its small size).

      We have now added to the manuscript (page 20):

      "The largest biochemical studies of miRNA-target interactions to date (McGeary et al., 2019, 2022) used AGO-RBNS (RNA Bind-n-Seq) to extract relative binding affinities, demonstrating strong correlations between the canonical seed types and binding affinity. Therefore, it is intriguing that our dataset, despite its small size, showed no such correlation."

      Figure 2C is not referenced in the text (the authors should go back through the text to make sure everything is referenced and in order). The Kds should be listed alongside the gels in Figure 2C.

      Figure 2 has now been rearranged and updated, with KD,app values listed in what is now Figure 2D.

      Figure 3B is rather confusing to understand.

      We have now adapted Figure 3 to simplify readability. Panel B has now been moved to C, and we have introduced panel A (moved from Figure 2B). In Figure 3C (originally 3B) we have added arrows to indicate the direction of affinity change from binary to ternary complex, and moved the duplex release information to panel A. We thank the reviewer and think that the data is now much clearer.

      Figure 3. AGO2 moderates affinity by strengthening weak binders and weakening strong binders. (A) Correlation of relative mRNA:miR-34a with mRNA:miR-34a-AGO2 binding affinities. No seed type correlation is observed; seeds are coloured, where 8mer is pink, 7mer-m8 is turquoise, and 7mer-A1 is mauve. The slope of the linear fit is 0.48, and intercept on the (log y)-axis is 7.11. The occurrence of miRNA duplex release from AGO2 is marked with diamonds. (B) miR-34a-mediated repression of dual luciferase reporters fused to the 12 mRNA targeting sites. Luciferase activity from HEK293T cells co-transfected with each reporter construct and miR-34a was measured 24 hours following transfection and normalised to the miR-34a-negative transfection control. Each datapoint represents the R/F ratio for an independent experiment (n=3) with standard deviations indicated. SCRseed is a scrambled seed control, SCRall is a fully scrambled control, and PERFECT is the perfect complement of miR-34a. Dotted horizontal lines represent the repression values for the 22-nucleotide seed-only controls6 for the respective seed types, in the absence of any other WC base pairing. (C) Comparison of relative target repression with relative affinity assessed by EMSA. Blue represents mRNA:miR-34a affinity (binary complex), while green represents mRNA:miR-34a-AGO2 affinity (ternary complex). Arrows indicate the direction of change in affinity upon binding within AGO2 compared to the binary complex. It is seen that AGO2 moderates affinity bi-directionally by strengthening weak binders and weakening strong binders.

      Page 20: Perfect should be italicized.

      Thank you for bringing this to our attention, this has now been adjusted.

      Have the authors considered using NMR to assess the base pair pattern formed between the miRNA:mRNA complexes (with / without AGO)? As a validation for results obtained by RABS? This could be helpful for the Asymmetric target binding section, the Ago increases flexibility section, and the three distinct structural groups section in the results. It is widely accepted that while chemical probing is insightful, results should be validated using alternative approaches. Distinguishing structural changes and protected reactivity in the presence of protein is challenging.

      NMR provides high-resolution information on RNA base-pairing patterns, allowing us to compare our RABS results for SIRT1 with those obtained via NMR (Banijamali et al., 2022) for the binary complex. For SIRT1, the RNA:RNA structures identified were consistent between both methods. However, using NMR to measure RNA:RNA binding within AGO2 is challenging due to the protein's large size. Currently, there are no published complete NMR structures of RNA within AGO2. The largest solution-state NMR structures published that include AGO consist solely of the PAZ domain. Our group has been working on method development using DNP-enhanced solid-state NMR to obtain structural information within the complete AGO2 protein, but the current resolution does not allow us to fully reconstruct a complete NMR structure. We hope that in the coming years, this will be a method to evaluate RNA within AGO. This limitation highlights the advantage of RABS in providing RNA base-pairing information within the ternary complex in solution.

      Reviewer #3 (Significance (Required)):

      The work is helpful for understanding how microRNAs recognize and bind their mRNA targets, and the impact Ago has on this interaction. I think for therapeutic studies, this will be helpful for structure-based design. Especially given the three types of structures identified to be a part of the interaction.

      We thank the reviewer for their detailed remarks, especially concerning the importance of technical details of the binding assays. We further thank the reviewer for recognising the potential impact of our work for rational design.

      4. Description of analyses that authors prefer not to carry out

      In response to Reviewer 2 - major comment 1, we prefer to not run an additional ion exchange purification on the AGO2 protein due to the reasoning discussed above, which is repeated here:

      Thank you for mentioning this crucial point which has been a focus of our controls. We have addressed this point in four ways:

      1. Salt wash during reverse IMAC purification.
      2. Separation of unbound RNA and proteins via SEC.
      3. Blocking non-specific interactions using polyuridine.
      4. Observing both the presence and absence of duplex release among different targets using the same AGO2 preparation and conditions.

      Firstly, although we did not use a specific ion exchange column for purification, we believe the ionic strength used in our IMAC wash step was sufficient to remove non-specific interactions. We used a linear gradient of buffer A (50 mM Tris-HCl, 300 mM NaCl, 10 mM Imidazole, 1 mM TCEP, 5% glycerol v/v) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 300 mM Imidazole, 1 mM TCEP, 5% glycerol) at pH 8. The protocol followed the recommendation by BioRad for their Profinity IMAC resins, which states that 300 mM NaCl should be included in buffers to deter nonspecific protein binding due to ionic interactions. The protein itself has a higher affinity for the resin than nucleic acids.

      A commonly used protocol for RISC purification follows the method by Flores-Jasso et al. (RNA 2013). Here, the authors use ion exchange chromatography to remove competitor oligonucleotides. After loading, they washed the column with lysis buffer (30 mM HEPES-KOH at pH 7.4, 100 mM potassium acetate, 2 mM magnesium acetate and 2 mM DTT). AGO was eluted with lysis buffer containing 500 mM potassium acetate. Competing oligonucleotides were eluted in the wash.

      As ionic strength is independent of ion identity or chemical nature of the ion involved (Jeremy M. Berg, John L. Tymoczko, Gregory J. Gatto Jr., Biochemistry 2015), we reasoned that our Tris-HCl/NaCl/imidazole buffer wash should have a comparable ionic strength to the Flores-Jasso protocol.

      Our total ionic contributions were: 500 mM Na+, 550 mM Cl-, 50 mM Tris and 300 mM imidazole. We recognise that Tris and imidazole are both partially ionized according to the pH of the buffer (pH 8) and their respective pKa values, but even if only considering the sodium and chloride it should be comparable to the Flores-Jasso protocol.
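
      The comparison can be made explicit with the standard definition of ionic strength, I = 1/2 Σ c_i z_i². The sketch below is a simplified estimate under stated assumptions: it counts only Na+ and Cl- for the IMAC buffer (using the concentrations listed above) and only K+ and acetate from the 500 mM potassium acetate in the Flores-Jasso elution buffer, ignoring the partially ionized buffering species and the 2 mM magnesium acetate. The function and variable names are illustrative.

```python
def ionic_strength_molar(ions):
    """Ionic strength I = 0.5 * sum(c_i * z_i**2).

    `ions` is an iterable of (concentration in mol/L, integer charge) pairs.
    """
    return 0.5 * sum(concentration * charge ** 2 for concentration, charge in ions)


# Fully dissociated monovalent ions only; buffering species are ignored (assumption).
imac_buffer_b = [(0.500, +1), (0.550, -1)]          # Na+, Cl- as listed above
flores_jasso_elution = [(0.500, +1), (0.500, -1)]   # K+, acetate from 500 mM potassium acetate

imac_strength = ionic_strength_molar(imac_buffer_b)
elution_strength = ionic_strength_molar(flores_jasso_elution)

print(f"IMAC buffer B:        I ~ {imac_strength:.3f} M")
print(f"Flores-Jasso elution: I ~ {elution_strength:.3f} M")
```

      Under these simplifications the two values are of similar magnitude, which is the sense in which the buffers are described as comparable. Including the ionized fractions of Tris and imidazole would raise the IMAC figure slightly.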

      Secondly, after reverse HisTrap purification, AGO2 was run through size exclusion chromatography to remove any remaining impurities (shown in Figure S2B).

      Thirdly, knowing that AGO2 has many positively charged surface patches and can bind nucleic acid nonspecifically (Nakanishi, 2022; O'Geen et al., 2018), we tested various blocking backgrounds to eliminate nonspecific binding effects in our EMSA ternary binding assays. We were able to address this issue by adding either non-homogenous RNA extract or homogenous polyuridine (pU) in our EMSA buffer during equilibration background experiments. This allowed us to eliminate non-specific binding of our target mRNAs, as shown previously in Supplementary Figure S6. We appreciate that the reviewer finds this technical detail important and have moved the panel C of figure S6 into the main results in Figure 2C, to highlight the novel conditions used and important controls needed to be performed. If miR-34a were non-specifically bound to the surface of AGO2 after washing, this blocking step would render any impact of surface-bound miR-34a negligible due to the excess of competing polyuridine (pU).

      Our EMSA results show that, using polyU, we can reduce non-specific interaction between AGO2 and RNAs that are present. And still, duplex release occurs despite the blocking step. It is therefore less likely that duplex release is caused by surface-bound miR-34a.

      Finally, the observation of distinct duplex release for certain targets, but not for others (e.g. MTA2, which bound tightly to miR-34a-AGO2 but did not exhibit duplex release; see Figure 2), argues against the possibility that the phenomenon was solely due to non-specifically bound RNA releasing from AGO2.

      In response to the reviewer's statement "Since properly loaded miR-34a is never released from AGO2, it is impossible for the miR-34a loaded into AGO2 to form the binary complex (mRNA:miR-34a)" we would like to refer to the three papers, De et al. (2013), Jo MH et al. (2015), and Park JH et al. (2017), which have previously reported duplex release and collectively provide considerable evidence that miRNA can be unloaded from AGO in order to promote turnover and recycling of AGO. It is known that AGO recycling must occur, so there must be some mechanism that releases miRNA from AGO2 to enable this. It is possible that AGO recycling proceeds via miRNA degradation (TDMD) in the cell, but in the absence of enzymes responsible for oligouridylation and degradation, the miRNA duplex may be released. As TDMD-competent mRNA targets have been observed to release the miRNA 3' tail from AGO2 (Sheu-Gruttadauria et al., 2019; Willkomm et al., 2022), there is a possible mechanistic similarity between the two processes; however, we do not have sufficient data to make any statement on this.

    1. To view the code for the exercise solution, you can click on the file   index.html

      I did everything correctly, but in the solution the orphan (self-closing) tag for the language comes right after the Doctype. I put it inside the paired head tags. Does that cause a problem? And could I have taken my course notes down incorrectly?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors provide a new computational platform called Vermouth to automate topology generation, a crucial step that any biomolecular simulation starts with. Given a wide array of chemical structures that need to be simulated, varying qualities of structural models as inputs obtained from various sources, and diverse force fields and molecular dynamics engines employed for simulations, automation of this fundamental step is challenging, especially for complex systems and when there is a need to conduct high-throughput simulations in the application of computer-aided drug design (CADD). To overcome this challenge, the authors develop a programming library composed of components that carry out various types of fundamental functionalities that are commonly encountered in topology generation. These components are intended to be general for any type of molecule and not to depend on any specific force field or MD engine. To demonstrate the applicability of this library, the authors employ those components to re-assemble a pipeline called Martinize2 used in topology generation for simulations with a widely used coarse-grained (CG) model, MARTINI. This pipeline can fully recapitulate the functionality of its original version Martinize but exhibits greatly enhanced generality, as confirmed by the ability of the pipeline to faithfully generate topologies for two high-complexity benchmarking sets of proteins.

      Strengths:

      The main strength of this work is the use of concepts and algorithms associated with induced subgraphs in graph theory to automate several key but non-trivial steps of topology generation, such as the identification of monomer residue units (MRUs), the repair of input structures with missing atoms, the mapping of topologies between different resolutions, and the generation of parameters needed for describing interactions between MRUs.
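
      As a rough illustration of the induced-subgraph idea mentioned above, the sketch below searches a small labeled molecule graph for every placement of a template fragment. It is a brute-force toy written for this note and is not Vermouth's API: the function name `find_induced_matches`, the dictionary-based graph representation, and the example molecule are all assumptions made for illustration.

```python
from itertools import permutations


def find_induced_matches(mol_atoms, mol_bonds, tmpl_atoms, tmpl_bonds):
    """Return every mapping template-node -> molecule-node such that
    element labels agree and the molecule subgraph induced by the mapped
    nodes has exactly the template's bonds (no extra, no missing)."""
    mol_edges = {frozenset(b) for b in mol_bonds}
    tmpl_edges = {frozenset(b) for b in tmpl_bonds}
    tmpl_nodes = list(tmpl_atoms)
    matches = []
    # Brute force: try every ordered choice of molecule atoms.
    for candidate in permutations(mol_atoms, len(tmpl_nodes)):
        mapping = dict(zip(tmpl_nodes, candidate))
        # Element labels must agree node by node.
        if any(tmpl_atoms[t] != mol_atoms[m] for t, m in mapping.items()):
            continue
        # Induced condition: a pair is bonded in the molecule
        # if and only if it is bonded in the template.
        induced_ok = all(
            (frozenset((mapping[a], mapping[b])) in mol_edges)
            == (frozenset((a, b)) in tmpl_edges)
            for i, a in enumerate(tmpl_nodes)
            for b in tmpl_nodes[i + 1:]
        )
        if induced_ok:
            matches.append(mapping)
    return matches


# Hypothetical example: heavy atoms of ethanol, C1-C2-O3.
molecule_atoms = {1: "C", 2: "C", 3: "O"}
molecule_bonds = [(1, 2), (2, 3)]
# Template fragment: a carbon bonded to an oxygen.
template_atoms = {"a": "C", "b": "O"}
template_bonds = [("a", "b")]

matches = find_induced_matches(
    molecule_atoms, molecule_bonds, template_atoms, template_bonds
)
print(matches)
```

      The exhaustive search scales factorially and is only meant to make the matching condition explicit; a production library would rely on a dedicated subgraph-isomorphism algorithm.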

      Weaknesses:

      Although the Vermouth library appears promising as a general tool for topology generation, the current manuscript provides insufficient information, and there is a lack of documentation that would allow users to easily apply this library. More detailed explanation of the various classes that are mentioned, such as Processor, Molecule, Mapping, and ForceField, is still needed, including the inputs, outputs, and associated operations of these classes. Some simple demonstrations of the application of these classes would be of great help to users. The formats of the internal databases used to describe reference structures and force fields may also need to be clarified. This is particularly important when Vermouth needs to be adapted for other AA/CG force fields and other MD engines.

      We thank the reviewer for pointing out the strengths of the presented work and agree that one of the current limitations is the lack of documentation about the library. In the revision, we point more clearly to the documentation page of the Vermouth library, which contains more detailed information on the various processors. The format of the internal databases has also been added to the documentation page. Providing a simple demonstration of applications of these classes is a great suggestion; however, we believe that it is more convenient to provide those in the form of code examples in the documentation or, for instance, Jupyter notebooks rather than in the paper itself.

      The successful automation of Vermouth relies on reference structures that need to be pre-determined. In the case of the study of 43 small ligands, the reference structures and corresponding mappings to MARTINI-compatible representations for all these ligands have already been defined in the M3 force field and added to the Vermouth library. However, the authors need to comment on the scenario where significantly more ligands need to be considered and other force fields, lacking reference structures and mapping schemes, need to be used as CG representations.

      We acknowledge that vermouth/martinize2 is not capable of automatically generating Martini mappings or parameters on the fly for unknown structures that are not part of the database. However, this capability is not the purpose of the program, which is rather to distribute and manage existing parameters. Unlike atomistic force fields, which frequently have automated topology builders, Martini parameters are usually obtained for a set of specific molecules at a time and benchmarked accordingly. As more parameters are obtained by researchers, they can be added to the vermouth library via the GitHub interface in a controlled manner. This process allows the database to grow, and in our opinion it will quickly grow beyond the currently implemented parameters. Furthermore, the API of Vermouth is set up in a way that it can easily interface with automated topology builders, which are currently being developed. Hence, in our view, this limitation does not diminish the applicability of vermouth to high-throughput applications with many ligands. The framework exists and works; only more parameters need to be added.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript by Kroon, Grunewald, Marrink and coworkers presents the development of the Vermouth library for coarse-grained assignment and parameterization and an updated version of the Python script, the Martinize2 program, to build Martini coarse-grained (CG) models, primarily for protein systems.

      Strengths:

      In contrast to many mature and widely used tools to build all-atom (AA) models, there are few well-accepted programs for CG model constructions and parameterization. The research reported in this manuscript is among the ongoing efforts to build such tools for Martini CG modeling, with a clear goal of high-throughput simulations of complex biomolecular systems and, ultimately, whole-cell simulations. Thus, this manuscript targets a practical problem in computational biophysics. The authors see such an effort to unify operations like CG mapping, parameterization, etc. as a vital step from the software engineering perspective.

      Weaknesses:

      However, the manuscript in its current form is unclear about its scientific novelty and appears incremental upon existing methods and tools. The only "validation" (more like an example application) is to create Martini models with two protein structure sets (I-TASSER and AlphaFold). The success rate in building the models was only 73%, with most failures due to incomplete AA coordinates. This suggests a dependence on the input AA models, which makes the results less attractive for high-throughput applications (for example, preparation/creation of the AA models can become the bottleneck). There seems to be an improvement in considering the protonation state and chemical modification, but convincing validation is still needed. Besides, limitations in the existing Martini models remain (such as the restricted dynamics due to the elastic network, the electrostatic interactions, or polarizability).

      We thank the reviewer for pointing out the strengths of the presented work, but respectfully disagree with the criticism that the presented work is only incremental upon existing methods and tools. All MD simulations of structured proteins regardless of the force field or resolution rely on a decent initial structure to produce valid results. Therefore, failure upon detection of malformed protein input structures is an essential feature for any high-throughput pipeline working with proteins, especially considering the computational cost of MD simulations. We note that programs such as the first version of Martinize generate reasonable-looking input parameters that lead to unphysical simulations and wasted CPU hours.

      The AlphaFold database, for which we surveyed 200,000 structures, contained only 7 problematic structures, which means that the success rate was more than 99.99% for this database. This example simply shows that users potentially have to add the step of fixing atomistic protein input structures if they seek to run a high-throughput pipeline. However, they can at least be assured that martinize2 checks that no issues persist.

      Furthermore, we note that the manuscript does not aim to validate or improve the existing Martini (protein) models. All example cases presented in the paper are subject to the limitations of the protein models for the reason that martinize2 is only the program to generate those parameters. Future improvements in the protein model, which are currently underway, will immediately be available through the program to the broader community.  

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Kroon et al. describes two algorithms which, when combined, achieve high-throughput automation of "martinizing" protein structures with selected protonation states and post-translational modifications.

      Strengths:

      A large-scale protein simulation was attempted, showing strong evidence that the authors' algorithms work smoothly.

      The authors described the algorithms in detail and shared the open-source code under the Apache 2.0 license on GitHub. This allows both reproducibility and extended usefulness within the field. These algorithms are potentially impactful if the authors can address some of the issues listed below.

      We thank the reviewer for pointing out the strengths.  

      Weaknesses:

      One major caveat of the manuscript is that the authors claim their algorithms aim to "process any type of molecule or polymer, be it linear, cyclic, branched, or dendrimeric, and mixtures thereof" and "enable researchers to prepare simulation input files for arbitrary (bio)polymers". However, the examples provided by the manuscript only support one type of biopolymer, i.e. proteins. Despite the authors' recommendation of using polyply along with martinize2/vermouth, no concrete evidence has been provided to support the authors' claim. Therefore, the manuscript must be modified to either remove these claims or include new evidence.

      We acknowledge that the current manuscript is largely protein-centric. To some extent this results from the legacy of martinize version 1, which was also only used for proteins. However, to show that martinize2 also works for cyclic as well as branched molecules, we implemented two additional test cases and updated what was formerly Figure 6 and is now Figure 7. Crown ether is used as an example of a cyclic molecule, whereas a small branched polyethylene molecule is a test case for branching. Needless to say, neither molecule is a protein or a biomolecule.

      Method descriptions of Martinize2 and the graph algorithms in the SI should be core content of the manuscript. I argue that Figure S1 and Figure S2 are more important than Figure 3 (protonation state). I recommend that the authors make a workflow chart combining Figures S1 and S2 to explain Martinize2 and the graph algorithms in the main text.

      The reviewer's critique is fair. Given the already rather large manuscript, we tried to strike a balance between describing benchmark test cases, some practical usage information (e.g. the Histidine modification), and the algorithmic library side of the program. In particular, we chose to add the figure on protonation state, because how to deal with protonation states—in particular, Histidines—was amongst the top three raised issues by users on our GitHub page. Due to this large community interest, we consider the figure equally important. However, we moved Figure S1 from the Supporting Information into the manuscript and annotated the already mentioned text with the corresponding panels to more clearly illustrate the underlying procedure. 

      In Figure 3 (protonation state), the figure itself and the captions are ambiguous about whether at the end the residue is simply renamed from HIS to HIP, or if hydrogen is removed from HIP to recover HIS.

      Using either of the two routes yields the same parameters in the end, which are for the protonated Histidine. In the second route, the extra hydrogen on Histidine is detected as an additional atom and therefore a different logic flow is triggered. Atoms are never removed, but only compounded to a base block plus modification atoms. We adjusted the figure caption to point this out more clearly.  

      In "Incorporating a Ligand small-molecule Database", the authors are calling for a community effort to build a small-molecule database. Some guidance on when the current database/algorithm combination does or does not work will help the community in contributing.

      Any small molecule that is not part of the database will not work. However, martinize2 will quickly identify whether components of the system are missing and alert the users. At that point, the users can decide to make their own files, guided by the new documentation pages.

      A speed comparison is needed to compare Martinize2 and Martinize.

      We respectfully disagree that a speed comparison is needed. We already noted in the discussion of the manuscript that martinize2 is slower, since it does more checks, is more general, and does not implement only a single protein model.

    1. As of January 2023, GitHub reported having over 100 million developers and more than 420 million repositories, including at least 28 million public repositories. It is the world's largest source code host as of June 2023.

      Github 100 million developers

    1. your code only works in CodePens and JSFiddles because those execute the JavaScript after the DOM is parsed.

      execute JavaScript after the DOM is parsed

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

      Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      (1) I think the authors should state how many parameters require fitting to the data vs the total number of model parameters.

      All of the model's parameters are listed in Table 1. Each parameter has, in addition, references listed for the source of data (if one exists) along with how the data were used (’C’ calculated, ’F’ fit, ’E’ estimated, or ’S’ scaled) for the specific simulations that appear in this paper. While this is a daunting number of parameters, only a few of these parameters must be updated when modeling a new musculotendon.

      Similar to a Hill-type muscle model, at least 5 parameters are needed to fit the VEXAT model to a specific musculotendon: maximum isometric force (fiso), optimal contractile element (CE) length, pennation angle, maximum shortening velocity, and tendon slack length. However, similar to a Hill model, it is only possible to use this minimal set of parameters by making use of default values for the remaining set of parameters. The defaults we have used have been extracted from mammalian muscle (see Table 1) and may not be appropriate for modeling muscle tissue that differs widely in terms of the ratio of fast/slow twitch fibers, titin isoform, temperature, and scale.

      Even when these defaults are appropriate, variation is the rule for biological data rather than the exception. It will always be the case that the best fit can only be obtained by fitting more of the model’s parameters to additional data. Standard measurements of the active force-length relation, passive force-length relation, and force-velocity relations are quite helpful for improving the accuracy of the model for a specific muscle. It is challenging to improve the fit of the model’s cross-bridge (XE) and titin models because the data required are so rare. The experiments of Kirsch et al., Prado et al., and Trombitás et al. are unique to our knowledge. However, if more data become available, it is relatively straightforward to update the model’s parameters using the methods described in Appendix B or the code that appears online (https://github.com/mjhmilla/Millard2023VexatMuscle).

      We have modified the manuscript to make it clear that, in some circumstances, the burden of parameter identification for the VEXAT model can be as low as a Hill model:

      - Section 3: last two sentences of the 2nd paragraph, found at: Page 10, column 2, lines 1-12 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      - Table 1: last two sentences of the caption, found at: Page 11 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      (2) It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      All of the experiments simulated in this work are in-situ or ex-vivo. So far the main challenges of simulating any experiment have been quite consistent across both in-situ and ex-vivo datasets: there are insufficient data to fit most model parameters to a specific specimen and, instead, defaults from the literature must be used. In an ideal case, a specimen would have roughly ten extra trials collected so that the maximum isometric force, optimal fiber length, active force-length relation, passive force-length relation (up to ≈ 0.6 f_o^M), and the force-velocity relations could be identified from measurements rather than relying on literature values. Since most lab specimens are viable for a small number of trials (with the exception of cat soleus), we don’t expect this situation to change in future.

      However, if data are available, the fitting process is straightforward for either in-situ or ex-vivo data: use a standard numerical method (for example, non-linear least squares or the bisection method) to adjust the model parameters to reduce the errors between simulation and experiment. The main difficulty, as described in the previous paragraph, is the availability of data to fit as many parameters as possible for a specific specimen. As such, the fitting process really varies from experiment to experiment and depends mainly on the richness of measurements taken from a specific specimen, and from the literature in general.
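
      To make the fitting step concrete, here is a minimal sketch of single-parameter fitting with the bisection method. It does not use the VEXAT code: `toy_active_force` is a hypothetical Gaussian-shaped force-length curve, the width 0.45 is an arbitrary choice, and the "measurement" is synthetic, generated from a known maximum isometric force so that the recovered value can be checked.

```python
import math


def toy_active_force(f_iso, norm_length):
    """Hypothetical stand-in for a muscle model: maximum isometric force
    scaled by a Gaussian-shaped active force-length curve (illustrative only)."""
    return f_iso * math.exp(-((norm_length - 1.0) / 0.45) ** 2)


def bisect_root(residual, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with residual(x) close to zero.
    Assumes residual changes sign on the interval."""
    r_lo, r_hi = residual(lo), residual(hi)
    if r_lo * r_hi > 0:
        raise ValueError("residual must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if abs(r_mid) < tol or (hi - lo) < tol:
            return mid
        if r_lo * r_mid <= 0:
            hi = mid              # root lies in the lower half
        else:
            lo, r_lo = mid, r_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Synthetic "experiment": generated from a known parameter, not real data.
true_f_iso = 20.0
sample_length = 1.2
synthetic_force = toy_active_force(true_f_iso, sample_length)

# Fit f_iso so the simulated force matches the synthetic measurement.
fitted_f_iso = bisect_root(
    lambda f_iso: toy_active_force(f_iso, sample_length) - synthetic_force,
    lo=0.0,
    hi=100.0,
)
print(fitted_f_iso)
```

      With several parameters and many data points, a non-linear least-squares routine would replace the scalar root finder, but the structure (simulate, compare, adjust) stays the same.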

      Working from in-vivo data presents an entirely different set of challenges. When working with human data, for example, it’s just not possible to directly measure muscle force with tendon buckles, and so it is never completely clear how force is distributed across the many muscles that typically actuate a joint. Further, there is also uncertainty in the boundary condition of the muscle because optical motion capture markers will move with respect to the skeleton. Video fluoroscopy offers a method of improving the accuracy of measured boundary conditions, though only for a few labs due to its great expense. A final boundary condition remains impossible to measure in any case: the geometry and forces that act at the boundaries as muscle wraps over other muscles and bones. Fitting to in-vivo data is very difficult.

      While this is an interesting topic, it is tangential to our already lengthy manuscript. Since these reviews are public, we’ll leave it to the motivated reader to find this text here.

      Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model’s ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      (1) It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch.

      While many muscle physiologists are aware of the limitations of the Hill model, these limitations are not so well known among computational biomechanists. There are at least two reasons for this gap: there are few comprehensive evaluations of Hill models against several experiments, and some of the differences are quite nuanced. For example, active lengthening experiments can be replicated reasonably well using a Hill model if the lengthening is done on the ascending limb of the force length curve. Clearly the story is quite different on the descending limb as shown in Figure 9. Similarly, as Figure 8 shows, by choosing the right combination of tendon model and perturbation bandwidth it is possible to get reasonably accurate responses from the Hill model to stochastic length changes. Yet when a wide variety of perturbation bandwidths, magnitudes, and tendon models are tested it is clear that the Hill model cannot, in general, replicate the response of muscle to stochastic perturbations. For these reasons we think many of the Hill model’s drawbacks have not been clearly understood by computational biomechanists for many years now.

      (2) Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      We agree that it will be valuable to benchmark other models in the literature using the same set of experiments. Hopefully we, or perhaps others, will have the good fortune to secure research funding to continue this benchmarking work. This will, however, be quite challenging: few muscle models are accompanied by a professional-quality open-source implementation. Without such an implementation it is often impossible to reproduce published results let alone provide a fair and objective evaluation of a model.

      (3) For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening.

      The titin model described in the paper will provide an enhancement of force during a stretch-shortening cycle. This certainly would be an interesting next experiment to simulate in a future paper.

      (4) In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      We can only respond to what drives the frequency-dependent stiffness in the model, though we’re quite interested in what happens physiologically. Hopefully some new experiments will be done to examine this phenomenon in the future. In the case of the model, the reason is straightforward: the formulation of Eqn. 16 is responsible for this shift.

      Equation 16 has been formulated so that the acceleration of the attachment point of the XE is driven by the force difference between the XE and a reference Hill model (numerator of the first term in Eqn. 16) which is then low pass filtered (denominator of the first term in Eqn. 16). Due to this formulation the attachment point moves less when the numerator is small, or when the differences in the numerator change rapidly and effectively become filtered out. When the attachment point moves less, more of the CE’s force output is determined by variations in the length of the XE and its stiffness.

      On the other hand, the attachment point will move when the numerator of the first term in Eqn. 16 is large, or when those differences are not short lived. When the attachment point moves to reduce the strain in the XE, the force produced by the XE’s spring-damper is reduced. As a result, the CE’s force output is less influenced by variations of the length of the XE and its stiffness.
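
      The filtering behaviour described above can be illustrated with a generic first-order low-pass filter. This sketch is not Eqn. 16 and does not reproduce the VEXAT model: the filter form, the time constant `tau`, and the sinusoidal test inputs are all assumptions chosen only to show that slowly varying inputs pass through nearly unchanged while rapidly varying inputs are largely filtered out.

```python
import math


def low_pass(signal, dt, tau):
    """First-order low-pass filter, tau * dy/dt = x - y,
    integrated with explicit Euler steps of size dt."""
    y, out = 0.0, []
    for x in signal:
        y += (dt / tau) * (x - y)
        out.append(y)
    return out


def steady_amplitude(freq_hz, dt=1e-4, tau=0.05, duration=2.0):
    """Amplitude of the filtered response to a unit sine wave,
    measured over the second half of the run to skip the transient."""
    n = int(duration / dt)
    signal = [math.sin(2.0 * math.pi * freq_hz * k * dt) for k in range(n)]
    filtered = low_pass(signal, dt, tau)
    return max(abs(v) for v in filtered[n // 2:])


slow = steady_amplitude(1.0)    # slowly varying input: mostly passed through
fast = steady_amplitude(50.0)   # rapidly varying input: strongly attenuated
print(slow, fast)
```

      In the analogy, a strongly attenuated input corresponds to an attachment point that barely moves, so rapid length changes are taken up by the XE's spring-damper instead.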

      Reviewer #2 (Recommendations for the Authors):

      I find the clarity of the manuscript to be much improved following revision. While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction, the revised description of small length changes makes the interpretation much less confusing.

      Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established. Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      (1) While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction ...

      We have had to abstract some of the details of reality to have a model that can be used to simulate hundreds of muscles. In contrast, FiberSim produced by Kenneth Campbell’s group uses much less abstraction and might be of greater interest to you. FiberSim’s models include individual cross-bridges, titin molecules, and an explicit representation of the spatial geometry of a sarcomere. While this model is a great tool for testing muscle physiology questions through simulation, it is computationally expensive to use this model to simulate hundreds of muscles simultaneously.

      Kosta S, Colli D, Ye Q, Campbell KS. FiberSim: A flexible open-source model of myofilament-level contraction. Biophysical journal. 2022 Jan 18;121(2):175-82. https://campbell-muscle-lab.github.io/FiberSim/

      (2) Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established.

      Please see our response 1 to Reviewer # 1.

      (3) Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      Please see our response 2 to Reviewer #1.

    1. Michael Kan. FBI: Hackers Are Compromising Legit QR Codes to Send You to Phishing Sites. PCMAG, January 2022. URL: https://www.pcmag.com/news/fbi-hackers-are-compromising-legit-qr-codes-to-send-you-to-phishing-sites (visited on 2023-12-06).

      What kind of scenarios are there where people are scanning QR codes unnecessarily or haphazardly? The only time I use QR codes is when I need to scan the menu at a restaurant or when an event or a business would like me to get to a specific website via QR code to access a survey or something of that nature. While this scam is important to note and keep in mind for the future, it seems like the plausibility of people falling for this scam is lower than for other scams.

    1. Coding style is more important than I expected in the beginning. My start to software engineering started from being on the product-minded end of the spectrum and moved towards the “technical-minded” side of the spectrum.

      1000000% agree, code is basically useless if you can't give it to someone else and they can work with your code as well or figure it out.

    2. It’s the reason why when ChatGPT outputs some hogwash, it’s easier just to re-prompt it or write it from scratch yourself instead of trying to figure out the errors in its buggy code.

      I actually do use ChatGPT to debug small errors; it's great for when you've been staring at code for hours and can miss a small mistake.

    3. There’s a popular saying that debugging code is twice as hard as writing it.

      This is SO true! It's better to take a while meticulously writing the code in layers, making sure each respective step works before adding another layer. Otherwise debugging the whole thing is a nightmare! [ I'm in AME :') ]

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript investigates the dynamics of GC-content patterns in the 5'end of the transcription start sites (TSS) of protein-coding genes (pc-genes). The manuscript introduces a quite careful and comprehensive analysis of GC content in pc-genes in humans and other vertebrates, specially around the TSS. The result of this investigation states that "GC-content surrounding the TSS is largely influenced by patterns of recombination." (from end of Introduction)

      My main concern with this manuscript is one of causal reasoning, whether intended or not. I hope the authors can follow my reasoning below on how the logic sometimes seems to fail, and that they introduce changes to clarify their suggested mechanisms of action.

      The above quoted sentence from the end of the Intro is in conflict with this other sentence that appears at the end of the Abstract "the dynamics of GC-content in mammals are largely shaped by patterns of recombination". The sentence in the Intro seems to indicate that the effect is specific to TSSs, but the one in the abstract seems to indicate the opposite, that is, that the effect is ubiquitous.

      We are sorry about the lack of clarity. We have now rewritten the abstract and intro to emphasize that our results are restricted to the 5' end of genes, and that by "patterns of recombination" we mean "historic patterns of recombination".

      The observations as stated in the abstract are: "We observe that in primates and rodents, where recombination is directed away from TSSs by PRDM9, GC-content at protein-coding gene TSSs is currently undergoing mutational decay."

      If I understand the measurements described in the manuscript correctly, and the arguments around them, you seem to show that the mutational decay of GC-content in humans is independent of location (TSS or not), as noted here (also from the abstract) "These patterns extend into the open reading frame affecting protein-coding regions, and we show that changes in GC-content due to recombination affect synonymous codon position choices at the start of the open reading frame."

      Again, we have rewritten this section to clarify these points.

      There is one more result described in the manuscript, that in my mind is very important, but it is not given the relevance that it appears to me that it has. That is presented in Figure S3G. "we concluded that GC-content at the TSS of protein-coding genes is not at equilibrium, but in decay in primates and rodents. This decay rate is similar to the decay seen in intergenic regions that have the same GC-content (Figure S3G)"

      Thus, if the decaying effect happens everywhere, how can it be related to "recombination being directed away from TSSs by PRDM9" as it is stated in the abstract and in the model described in Figure 7?

      We make the argument that the GC-peak was likely caused by past recombination events. This is based on:

      1) The change in GC-content at the TSS in Dogs and Fox, coupled to the fact that they perform recombination at the TSS

      2) That the TSS can act as a default recombination site in mice when PRDM9 is knocked out

      3) That some forms of PRDM9 allow for recombination at TSS (see Schield et al., 2020, Hoge et al. 2023, and Joseph et al., 2023) and that this is expected to cause an increase in GC-content

      We thus speculate that the GC-peak in humans and rodents was caused by past recombination at TSSs that were permitted by ancient variants of PRDM9. We further point out that PRDM9 is undergoing rapid evolution, and some of the past versions of the protein may have had this property.

      We have tried to clarify these points in the latest version of the text.

      The fact that the decay rate is similar to any other region with similar GC-content should be an indication that the effect is not related to anything having to do with TSS or recombination being directed away from TSSs by PRDM9.

      We are sorry about the lack of clarity. TSSs in humans, chimpanzees, mice and rats are experiencing GC-decay at the same rate as in non-functional DNA regions with high GC-content. Thus the GC-peak is not being maintained by selection. This is surprising, given the role that GC-content plays in gene expression. This is a critical point, and we added it to the "conclusion" section of the abstract.

      I hope these paragraphs show my confusion about the relationship between the results presented which I think are very comprehensive and their interpretation and suggested model for GC-content dynamics around TSSs in human.

      On another note, can you provide a bit more background on recombination and its mechanisms?

      We have done our best to clarify these issues.

      You seem to have confident sets of genes under high/low/med recombination. How are those determined?

      We used the recombination rates per gene provided in Pouyet et al 2017 to identify the sets of genes under low/med/high recombination. Those rates were estimated from the HapMap genetic map (Frazer et al., 2007). This is now all specified in the methods section.

      You also seem to concentrate the cause of recombination on PRDM9, please explain. Is PRDM9 the unique indicator of recombination?

      PRDM9 has been shown to be the primary determinant of where recombination occurs in the genome (Grey et al., 2011, Brick et al., 2012). This is very well established. We now reword some of the introduction to make this clear.

      specific comments


      Figure 1, it is very hard to understand the differences between the three rows. Please explain more clearly in the legend, and add more information to the figure itself.

      We altered the axis titles to make this clearer. We also label "Upstream", "Exon 1" and "Part of Intron 1" in Figure 1C, F and I, and in Figure 2C. We now spell this out in the Figure Legend.

      Figure 7, express somewhere in the figure that the y axis measures GC content.

      We now added "GC Content" to the left of the first "graph" in Figure 7.

      Figure seems to introduce a 'causal' model of GC-content dismissing (diminishing?) based on recombination being directed away from TSSs. How about the diminishing of GC-content on any other genomic regions as you have shown in Figure S3G?

      Our focus in this model, and manuscript, is on TSSs. I think that to add the dynamics of other GC-rich regions is distracting. We do not know what caused these intergenic genomic regions to be high in GC-content prior to decay. After excluding known recombination sites and TSSs, these regions are very rare in the human genome. They may be ancient recombination sites that are decaying in GC-content. However, unlike TSSs, which have some connection to recombination (i.e. data from PRDM9 knockout mice and dogs and fox), we do not have any direct or indirect evidence that these other sites were used for recombination in the past. Alternatively, there could have been some other pressure on these sites in the past to increase GC-content that we are not aware of.

      -- The title is too selective, as to the results, and it has the implication that the decay is exclusive to the surrounding of the TSSs.

      Decay of GC-content towards equilibrium is the default state for non-functional DNA. That it is occurring at the TSS is surprising, as it indicates that the GC-peak is not maintained by selection. We now state this in the paper and include this in the "conclusion" portion of the abstract.

      Reviewer #1 (Significance (Required)):

      The statistical analysis is comprehensive and robust.

      We thank the reviewer for this.

      Their model interpretation as described induces confusion and needs to be clarified.

      We are sorry about this. Hopefully our revised text will clear up the confusion.

      I am an expert computational biologist, I do not have a deep knowledge of sequence implications of recombination, and it would be good if the manuscript could add some more background on that.

      We thank the reviewer for their perspective, and we hope that our text changes better explain to the non-expert why our findings are so surprising. We further clarify how recombination affects DNA sequence by gBGC and some of these changes are detailed in our response to the other reviewers.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this work, the authors present various analyses suggesting that GC-content in TSS of coding genes is affected by recombination. The article findings are interesting and novel and are important to our understanding of how various non-adaptive evolutionary forces shape vertebrate genome evolutionary history.

      We thank the reviewer for these kind words.

      The Methods section includes most needed details (see comments below for missing information), and the scripts and data provided online help in transparency and usability of these analyses.

      I have several comments, mostly regarding clarifications in the text and several suggestions:

      1. In introduction: CpG islands, have been shown to activate transcription (Fenouil et al., 2012) - what is known about CpG Islands is somewhat inaccurately described. It should be rephrased more accurately, e.g. - CpG Islands found near TSS are associated with robust and high expression level of genes, including genes expressed in many tissues, such as housekeeping genes.

      We thank the reviewer for that. We have rewritten this part of the introduction.

      1. The following claim (in Introduction), regarding retrogenes and their GC content is not in agreement with recent analyses: "Indeed, it has been observed that these genes have elevated GC-content at their 5' ends in comparison to their intron-containing counterparts, suggesting that elevation of GC-content can be driven by positive selection to drive their efficient export (Mordstein et al., 2020). Moreover, retrogenes tend to arise from parental genes that have high GC-content at their 5'ends (Kaessmann et al.,2009)." Recent work showed that retrogenes in mouse and human are significantly depleted of CpG islands in their promoters (PMID: 37055747). This follows the notion that young genes, such as these retrogenes, have simple promoters (PMID: 30395322) with few TF binding sites and without CpGs. The two reported trends should be both mentioned with some suggestions regarding why they seem to be contrasting each other and how they can be reconciled.

      We thank the reviewer for this information. The previous report (Mordstein et al., 2020) indicated that the increase in GC-content occurs downstream of the TSS in retrogenes. Since sequences upstream of the TSS are not part of the retro-insertion, it is not surprising that GC-content may differ between the retrogene and the parental gene. That retrogenes have lower numbers of CpGs upstream of the TSS bolsters the idea that GC-content is not required for transcription and that the GC-peak is not being maintained in most genes by purging selection.

      1. In "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." I think you forgot the reference...

      We thank the reviewer for catching this.

      1. In Results, regarding average GC content (Fig 2X): "Interestingly, this pattern is different in the nonamniotes examined, including anole lizard, coelacanth, shark and lamprey." - in lizard, it seems that the genomic average is lower (and lizards are amniotes)

      You are absolutely right. We now fix this.

      1. In Discussion, the statement: "This model is supported by findings in a recent preprint, which documents the equilibrium state of GC-content in TSS regions from numerous organisms" seems to contrast with the findings of the mentioned preprint. If "most mammals have a high GC-content equilibrium state" but still have a functional PRDM9, in the lack of evidence for functional differences between ortholog PRDM9 proteins (such as signatures for positive selection or functional assays), the authors' findings regarding the relationship between a lack of PRDM9 in canids and the trends observed in their TSS, are weakened.

      We are sorry about the confusion. We were not exactly sure what points were being commented on. 1) whether GC-content is at equilibrium for most mammals or 2) that the equilibrium state is high for most mammals despite containing PRDM9. We rewrote this sentence to clarify both issues (especially given that these concepts may not be clear to non-experts, such as the first reviewer). To answer the first potential concern, the paper in question (Joseph et al., 2023), does not show that GC-content at the TSS in mammals is at equilibrium, rather, it calculates what the equilibrium state is given the nucleotide substitution rates. In most organisms, the TSS is not at equilibrium. To answer both 1 and 2, Joseph et al., show that the equilibrium GC-content at the TSS for canids is much higher than for other mammals. They and others infer that the diversity between other mammals (where the equilibrium state is higher than humans and rodents but lower than canids) has to do with the variation between PRDM9 orthologues, however this has yet to be tested. Although the action of PRDM9 has not been evaluated in most mammals, we do point out that in snakes PRDM9 allows for some recombination at the TSS.

      1. In Methods, the ENSEMBL version (in addition of the per-species genome version) should be mentioned.

      This has been fixed.

      1. In Fig 1, it is worth clarifying in the legend that the differences between the first and second rows of panels is in the length of the plotted region.

      We have now indicated this in the figure legend.

      Reviewer #2 (Significance (Required)):

      The manuscript provides a rigorous analysis of the possible processes that have impacted the TSS GC-content during evolution. It should be of interest to a diverse set of investigators in the genomics community, since it touches on different topics including genome evolution, transcription and gene structures.

      Thank you.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This study analyzes the distribution of GC-content along genes in humans and vertebrates, and particularly the higher GC-content in the 5'-end than in the 3'-end of genes. The results suggest that this pattern is ancient in vertebrates, currently decaying in mouse and humans, and probably driven by recombination and GC-biased gene conversion. It is proposed that the 5'-3' gradient was generated during evolution when PRDM9 was less active (in which case recombination occurs mostly near transcription start sites), and decays when PRDM9 is very active, as it is currently in humans and mouse. This is a very interesting hypothesis, also corroborated by a recent, similar analysis in mammals (Joseph et al. 2023). These two preprints, which appeared around the same time, are, I think, quite novel and important. The analyses performed here are thorough and convincing. Source code and raw data sets are openly distributed. I only have a couple of minor comments and suggestions, which I hope might help improve the manuscript.

      Thank you very much for the kind words.

      A1. There has been quite some work on the 5'-3' GC-content gradient in plants (e.g. Clément et al. 2014 GBE, Ressayre et al. 2015 GBE, Brazier & Glemin 2023 biorxiv), which you might like to cite.

      Thank you for pointing out these very interesting papers, we have incorporated them into the latest version.

      A2. CpG-content and GC-content are related in various ways (e.g. see Galtier & Duret 2000 MBE, Fryxell & Moon 2005 MBE) that you might like to discuss; currently the manuscript discusses the CpG hypermutation rate as a driver of GC-content but the picture might be a bit more complex.

      Thank you for this, we have incorporated these citations.

      A3. The model introduced by this manuscript (figure 7) is dependent on the evolution of recombination determination in vertebrates and the role of PRDM9. A recent preprint by Raynaud et al (biorxiv) seems relevant to this issue.

      Thank you for pointing out this pre-print. We have added a paragraph to the discussion that mentions this work. This also initiated a conversation with the authors, and we include some "personal communications" that illuminate what is going on in teleost fish.

      Line-by-line comments

      B1. "First, highly spliced mRNAs tend to have high GC-content at their 5' ends despite the fact that it is not required for export and does not affect expression levels (Mordstein et al., 2020)" -> I do not totally understand this sentence, which seems to imply some link between splicing and export/expression, could you please clarify?

      We rewrote that sentence to make it clearer.

      B2. "mismatches will form in the heteroduplex which are typically corrected in favor of Gs and Cs over As and Ts by about 70%" -> This 70% figure is human-specific, and varies a lot among species; I know in this introduction you're mainly reviewing the human literature but since this part of the text introduces gBGC as a process maybe clarify by adding "in humans" or refrain from giving this figure?

      Thank you. This is a good point. We fixed this.

      B3. "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." -> reference missing here; actually I'm not sure you will find a good reference for this because PRDM9-dependent hotspots are so short-lived that GC-content would only respond weakly; maybe rather refer to the equilibrium GC-content (and cite, for instance, Pratto et al 2014 Science), or to high-recombining regions instead of hotspots (and you have plenty of papers to cite)?

      Thanks for this.

      B4. Paragraph starting: "PRDM9 and recombination hotspots also experience accelerated rates of evolution..." -> I would suggest removing the word "also" and moving this paragraph up, just before the sentence I'm commenting above (the one starting "Thus GC-content..."). This will justify my suggestion in comment B3 of mentioning high-recombining regions instead of hotspots, while also avoiding to have the important paragraph on recombination at TSS (the one starting "There are interesting connections...") being sandwiched between two sections on PRDM9.

      We did not move this paragraph, although we did adjust the wording slightly.

      B5. Paragraph starting "There are interesting connections..." is crucial to your discussion and might be emphasized a bit more in introduction, in my opinion. For instance, what about adding a sentence like "Also not directly relevant to humans, these observations suggest that gBGC might have played a role in shaping the observed 5'-3' GC-content gradient."

      We did not alter the structure of this paragraph but we did reword sections of it.

      1. "Interestingly, this pattern is different in the non-amniotes examined, including anole lizard, coelacanth, shark and lamprey. These organisms had clear differences in GC-content between their first exon and surrounding sequences (upstream and intronic sequences), which came close to the overall genomic GC-content." -> I'm not sure I got the point the authors are intending to make here. Also please note that lizards are amniotes.

      We thank the reviewer for catching this error, we have fixed this.

      Reviewer #3 (Significance (Required)):

      This is one of two preprints having appeared ~at the same time (the other one being the cited Joseph et al 2023), which I think are quite important and convincing regarding the role of PRDM9-dependent and PRDM9-independent recombination on GC-content evolution in vertebrates. I support publication of this preprint in a molecular evolutionary journal.

      We thank the reviewer for their kind assessment!

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This study analyzes the distribution of GC-content along genes in humans and vertebrates, and particularly the higher GC-content in the 5'-end than in the 3'-end of genes. The results suggest that this pattern is ancient in vertebrates, currently decaying in mouse and humans, and probably driven by recombination and GC-biased gene conversion. It is proposed that the 5'-3' gradient was generated during evolution when PRDM9 was less active (in which case recombination occurs mostly near transcription start sites), and decays when PRDM9 is very active, as it is currently in humans and mouse. This is a very interesting hypothesis, also corroborated by a recent, similar analysis in mammals (Joseph et al. 2023). These two preprints, which appeared around the same time, are, I think, quite novel and important. The analyses performed here are thorough and convincing. Source code and raw data sets are openly distributed. I only have a couple of minor comments and suggestions, which I hope might help improve the manuscript.

      A1. There has been quite some work on the 5'-3' GC-content gradient in plants (e.g. Clément et al. 2014 GBE, Ressayre et al. 2015 GBE, Brazier & Glemin 2023 biorxiv), which you might like to cite.

      A2. CpG-content and GC-content are related in various ways (e.g. see Galtier & Duret 2000 MBE, Fryxell & Moon 2005 MBE) that you might like to discuss; currently the manuscript discusses the CpG hypermutation rate as a driver of GC-content but the picture might be a bit more complex.

      A3. The model introduced by this manuscript (figure 7) is dependent on the evolution of recombination determination in vertebrates and the role of PRDM9. A recent preprint by Raynaud et al (biorxiv) seems relevant to this issue.

      Line-by-line comments

      B1. "First, highly spliced mRNAs tend to have high GC-content at their 5' ends despite the fact that it is not required for export and does not affect expression levels (Mordstein et al., 2020)" -> I do not totally understand this sentence, which seems to imply some link between splicing and export/expression, could you please clarify?

      B2. "mismatches will form in the heteroduplex which are typically corrected in favor of Gs and Cs over As and Ts by about 70%" -> This 70% figure is human-specific, and varies a lot among species; I know in this introduction you're mainly reviewing the human literature but since this part of the text introduces gBGC as a process maybe clarify by adding "in humans" or refrain from giving this figure?

      B3. "Thus GC-content is expected, and is indeed observed to be higher near recombination hotspots due to gBGC (REF)." -> reference missing here; actually I'm not sure you will find a good reference for this because PRDM9-dependent hotspots are so short-lived that GC-content would only respond weakly; maybe rather refer to the equilibrium GC-content (and cite, for instance, Pratto et al 2014 Science), or to high-recombining regions instead of hotspots (and you have plenty of papers to cite)?

      B4. Paragraph starting: "PRDM9 and recombination hotspots also experience accelerated rates of evolution..." -> I would suggest removing the word "also" and moving this paragraph up, just before the sentence I'm commenting above (the one starting "Thus GC-content..."). This will justify my suggestion in comment B3 of mentioning high-recombining regions instead of hotspots, while also avoiding to have the important paragraph on recombination at TSS (the one starting "There are interesting connections...") being sandwiched between two sections on PRDM9.

      B5. Paragraph starting "There are interesting connections..." is crucial to your discussion and might be emphasized a bit more in introduction, in my opinion. For instance, what about adding a sentence like "Also not directly relevant to humans, these observations suggest that gBGC might have played a role in shaping the observed 5'-3' GC-content gradient."

      1. "Interestingly, this pattern is different in the non-amniotes examined, including anole lizard, coelacanth, shark and lamprey. These organisms had clear differences in GC-content between their first exon and surrounding sequences (upstream and intronic sequences), which came close to the overall genomic GC-content." -> I'm not sure I got the point the authors are intending to make here. Also please note that lizards are amniotes.

      Significance

      This is one of two preprints having appeared ~at the same time (the other one being the cited Joseph et al 2023), which I think are quite important and convincing regarding the role of PRDM9-dependent and PRDM9-independent recombination on GC-content evolution in vertebrates. I support publication of this preprint in a molecular evolutionary journal.

  5. Local file
    1. RUN grant through JeffCo = “Tri (Jefferson) County Workforce Board”?

      Yes and Yes -- This indicates a client that received RUN services through the Tri (Jefferson) County Workforce Board. Several of the regional workforce boards had both a RUN grant and a WIG grant (slightly different). Given these frequencies, I think the best we can do is explore the data by Coaching Collaborative and Trade Association Training. However, before I ask you to do that, could you please run the frequencies by zip code? The final question on the survey is the participants' zip code. I'd like to see if there is value in looking at regional differences, but before we do that, let's see if there's enough variability to make that worth it. Thank you!

    1. In the educational context, students from different parts of the world are able to interact with one another through oral, graphic and gestural language, because avatars make it possible to express gestures and emotions

      I like interacting on Engage, a professional platform where the avatars follow a dress code somewhere between casual and executive. In this environment I attend interactive training sessions, with one-on-one exercises and workshops, so immersive that I forget I am at home. We have the chance to be with people from several countries, at no cost.

      For work, I use the Virtual Speech platform to practice ambush interviews and on-stage presentations to audiences, in short, in the areas of public speaking. The result is impressive: after a training session outside the metaverse, our clients can practice in the metaverse in an immersive role play, as if they were in the real situation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Statements

      We thank all three reviewers for their time and care in reviewing our manuscript, in particular Reviewer 3 for providing a detailed critique that was very useful for planning revisions. We are grateful that all three reviewers indicate that the new genome resources presented in this work are of high-quality and address an existing knowledge gap. We are also grateful for general assessments that the manuscript is 'well-written', and the analyses 'well performed' and 'thorough'.

      We acknowledge Reviewer 3’s legitimate criticism that the assembly and annotation data is not already publicly available and would like to assure the reviewing team that we have been pressing NCBI to progress the submission status since before the preprint was submitted. We regret the delay but hope that we can resolve this issue promptly. Furthermore, as some additional fields in the REAT genome annotation are lost during the NCBI submission process, we will ensure that comprehensive annotation files are also added to Zenodo.

      Reviewer 3 also made the general comment that 'the manuscript could greatly benefit from merging the result and discussion sections' and we would naturally be happy to make this adjustment if the journal in question uses that format.

      Description of the planned revisions

      • We will follow suggestions by Reviewer 3 to improve clarity of two figures:

      Figure S9: Please use a more appropriate colour palette. It is difficult to know the copy number based on the colour gradient.

      Figure 5: Consider changing panel B for a similar version of Fig S12. I think it gives a cleaner and more general perspective of the presence of starship elements.

      • We will address the choice of LOESS versus linear regression for investigating the relationship between candidate secreted effector protein (CSEP) density and transposable element (TE) density, as queried by Reviewer 3:

      Lines 140-144: LOESS smoothing functions are based on local regressions and usually find correlations when there are very weak associations. The authors have to justify the use of this model versus a simpler and more straightforward linear regression. My suspicion is that the latter would fail to find an association. Also, there is no significance of Kendall's Tau estimate (p-value).

      We agree with the reviewer that, as we did not find an association with the more sensitive LOESS, we expect that linear regression would also fail to find one, which supports our current conclusions. We will add this negative result into the text.
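
      The reviewer's request for a significance value alongside Kendall's tau can be illustrated with a minimal sketch. It is not the authors' analysis: the function names, the choice of tau-a (no tie correction), the permutation test, and the per-window density values below are all assumptions made for illustration only.

```python
import random
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        if prod > 0:
            score += 1  # concordant pair
        elif prod < 0:
            score -= 1  # discordant pair (tied pairs contribute 0)
    return score / (n * (n - 1) / 2)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for tau under the null of no association."""
    rng = random.Random(seed)
    observed = abs(kendall_tau_a(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(kendall_tau_a(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical densities per genomic window (made-up numbers, not study data).
te_density = [0.12, 0.30, 0.05, 0.44, 0.21, 0.38, 0.09, 0.27]
csep_density = [0.02, 0.01, 0.03, 0.02, 0.00, 0.04, 0.01, 0.03]

tau = kendall_tau_a(te_density, csep_density)
p_value = permutation_p_value(te_density, csep_density)
print(f"tau = {tau:.3f}, permutation p = {p_value:.3f}")
```

      A permutation test is used here only because it needs no external library; a statistics package with a tie-corrected tau-b and an analytical p-value would be the more usual choice in practice.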

      • We will check for other features associated with the distribution of CSEPs, as queried by Reviewer 3:

      Lines 157-163: Was there any other feature associated with the CSEP enrichment? GC content? Repetitive content? Centromere likely localisation?

      • We will integrate TE variation into the PERMANOVA lifestyle testing, as suggested by Reviewer 3:

      Line 186: Why not to test the variation content of TEs as a factor for the PERMANOVA?

      In reviewing this suggestion, we also spotted an error in our data plotting code, and the PERMANOVA lifestyle result for all genes will be corrected from 17% to 15% in Fig. 4a. Correcting this error does not impact our ultimate results or interpretation.

      • To complement the current graphical-based assessment of approximate data normality, we will include additional tests (Shapiro-Wilk for sample sizes

      Line 743: Q-Q plots are not a formal statistical test for normality.

      • One of the main critiques from Reviewer 3 was that, although we already acknowledged low sample sizes being a limitation of this work, the manuscript could benefit from reframing with greater consideration of this factor. They also highlighted a few specific places in the text that could be rephrased in consideration of this:

      Line 267: "Multiple strains" can be misleading about the magnitude.

      Lines 305-307: The fact that there is significant copy number variation between the two GtA strains suggests that the variation in the GtA lineage has not been fully captured and that there may be an unsampled substructure. Although the authors acknowledge the need for pangenomic references, they should recognize this limitation in the sample size of their own study, especially when expressing its size as "multiple strains" (line 267).

      Lines 314-317: Again, the sample size is still very small and likely not representative. It suggests UNSAMPLED substructure even for the UK populations.

      Line 164 (and whole section): I would invite the authors to cautiously revisit the use of the terms "core", "soft core". The sample size is very low, as they themselves acknowledge, and probably not representative of the diversity of Gaeumannomyces.

      We intend to edit the text to address this, including removal of both text and figure references to ‘soft-core’ genes, as we agree the term is likely not meaningful in this case, and removing it has no bearing on the results or interpretation.

      Description of the revisions that have already been incorporated in the transferred manuscript

      • We have amended the text in a number of places for clarity/fluency as suggested by Reviewer 3:

      ii) There need to be an explicit conclusion about the differences between pathogenic Gt and non-pathogenic Gh. Somehow, this is not entirely clear and is probably only a matter of rephrasing.

      Please see new lines 477-478: ‘Regarding differences between pathogenic Gt and non-pathogenic Gh, we found that Gh has a larger overall genome size and greater number of genes.’

      Lines 309-314: The message seems a bit out of context in the paragraph.

      This is valid, these lines have now been removed.

      Lines 392-395: The idea that crop pathogenic fungi are under pressure that favours heterothallism does not take into account the multiple cases of successful pathogenic clonal lineages in which sexual reproduction is absent. This paragraph seems very speculative to me. Please rephrase it.

      Our intention here was the exact reverse, that crop pathogens are under pressure to favour homothallism (as Reviewer 3 points out, anecdotally this often seems to play out in nature). We have rephrased lines 386-390 to hopefully make our stance more explicit: 'Together, this could suggest a selective pressure towards homothallism for crop fungal pathogens, and a switch from heterothallism in Gh to homothallism in Gt and Ga may, therefore, have been a key innovation underlying lifestyle divergence between non-pathogenic Gh and pathogenic Gt and Ga.'

      Lines 463-464: Please refer to the analyses when discussing the genetic divergence.

      We have rephrased this sentence to make our intended point clearer, please see new lines 459-461: ‘If we compare Ga and Gt in terms of synteny, genome size and gene content, the magnitude of differences does not appear to be more pronounced than those between GtA and GtB.’

      • We have also fixed the following typographic errors highlighted by Reviewer 3:

      Line 399: You mean, Fig 4C?

      Line 722: You missed "trimAI"

      Lines 723-727: Missing citations for "AMAS" and RAxML-NG, "AHDR" and "OrthoFinder"

      • We have added genome-wide RIP estimates to Supplementary Table S1 as requested by Reviewer 3:

      Lines 416-422: Please provide the data related to the genome-wide estimates of RIP.

      • We have added a note clarifying that differences in overall genome size between lineages are not fully explained by differences in gene copy-number (lines 406-408: 'We should note that the total length of HCN genes was not sufficiently large to account for the overall greater genome size of GtB compared to GtA (Supplemental Table S1).') in response to a comment from Reviewer 3:

      Line 396: The difference in duplicated genes raises the question of whether there are differences in overall genome size between lineages and, if so, whether they can be explained by the presence of genes.

      • We have made an alteration to the author order and added equal second-author contributions.

      Description of analyses that authors prefer not to carry out

      • In response to our analysis regarding the absence of TE-effector compartmentalisation in this system, Reviewer 1 requested additional analyses:

      While TE enrichment is typically associated with accessory compartments, it is not a defining feature. To bolster the authors' claim, it is essential to demonstrate that there is no bias in the ratio of conserved and non-conserved genes across the genomes.

      We believe that there are two slightly different compartmentalisation concepts being somewhat conflated here – (1) the idea of compartments where TEs and virulence proteins such as effectors are significantly colocalised in comparison with the rest of the genome, and (2) the idea of compartments containing gene content that is not shared in all strains (i.e. accessory). The two may overlap – as Reviewer 2 states, accessory compartments may also be enriched with TEs – but not necessarily. We specifically address the first concept in our text, and we appreciate Reviewer 3’s response on this subject:

      There is a clear answer for the compartmentalisation question. The authors favour the idea of "one-compartment" with compelling analyses.

      We believe that the second concept of accessory compartments is shown to be irrelevant in this case from our GENESPACE results (see Fig. 2), which demonstrate that gene content is conserved, broadly syntenic even, across strains, with no clear evidence of accessory compartments or chromosomes regarding gene content. We have already acknowledged that other mechanisms of compartmentalisation beyond TE-effector colocalisation may be at play (as seen from our exploration of effector distributions biased towards telomeres, see section from line 156: ‘Although CSEPs were not broadly colocalised with TEs, we did observe that they appeared to be non-randomly distributed in some pseudochromosomes (Fig. 3a)…’).

      • Reviewer 1 questioned the statement that higher level of genome-wide RIP is consistent with lower levels of gene duplication:

      L422: Is the highest RIP rate in GtA consistent with its low levels of gene duplication? Does this suggest that duplicated sequences in GtA are no longer recognizable due to RIP mutations? This seems counterintuitive, as RIP is primarily triggered by gene duplication.

      Our understanding is that, while RIP can directly mutate coding regions, it predominantly acts on duplicated sequences within repetitive regions such as TEs (https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-02060-w), which has a knock-on effect of reducing TE-mediated gene duplication. In Neurospora crassa, where RIP was first discovered and thus the model species for much of our understanding of the process, a low number of gene duplicates has been linked to the activity of RIP (https://www.nature.com/articles/nature01554). We therefore believe the current text is reasonable.

      • Reviewer 2 stated that experimental validation of gene function is required to make clear links to lifestyle or pathogenicity:

      In my eyes, the study has two main limitations. First of all, the research only concerns genomics analyses, and therefore is rather descriptive and observational, and as such does not provide further mechanistic details into the pathogen biology and/or into pathogenesis. This is further enhanced by the lack of clear observations that discriminate particular species/lineages or life styles from others in the study. Some observations are made with respect to variations in candidate secreted effector proteins and biosynthetic gene clusters, but clear links to life style or pathogenicity are missing. To further substantiate such links, lab-based experimental work would be required.

      We agree that, in an ideal world, supporting experimental evidence of gene function from wet-lab work would be included. Unfortunately, a reliable transformation system has not yet been developed for this system (see lines 33-35: ‘There have also been considerable difficulties in producing a reliable transformation system for Gt, preventing gene disruption experiments to elucidate function (Freeman and Ward 2004).’). This is not for lack of trying: after 18 months of effort using all available transformation techniques and selectable markers, neither Gt nor Gh was transformable. Undertaking that challenge has proven to be far beyond the scope of this paper, the purpose of which was to generate and analyse high-quality genomic data, a major task in itself. We again appreciate Reviewer 3’s response to this point, agreeing that it is out of scope for this work:

      I just want to respectfully disagree with reviewer #2 about the need for more experimental laboratory work, as in my opinion it clearly goes beyond the intention and scope of the submitted work. This could be a limitation that would depend on the chosen journal and its specific format and requirements. Finally, I think it would suffice for the authors to discuss on the lack of in-depth experimental work as part of the limitations of their overall approach.

      As per the suggestion by Reviewer 3, we will add text to address the absence of in-depth experimental work within the scope of this study.

      • Reviewer 3 suggested we might 'consider including formal population differentiation estimators', however, as they previously highlighted above, our sample sizes are too small to produce reliable population-level statistics.

      • Reviewer 3 raised the disparity in the appearance of branches at the root of phylogenetic trees in various figures:

      Figure 4a (and Figs S5, S13): The depicted tree has a trichotomy at the basal node. Please correct it so Magnaporthiopsis poae is resolved as an outgroup, as in Fig. S17.

      All the trees were rooted with M. poae as the outgroup. Although it may seem counterintuitive, a trifurcation at the root is the correct outcome when rerooting a bifurcating tree; please see this discussion, which includes the developers of both leading phylogeny visualisation tools, ggtree and phytools (https://www.biostars.org/p/332030/). Although it is possible to force a bifurcating tree after rooting by positioning the root along an edge, the resulting branch lengths can be misleading, so in cases where we wanted to include meaningful branch lengths in the figure (i.e. estimated from DNA substitution rates, in Figures 4a, S5 and S13) we have not circumvented the trifurcation. In Fig. S17, meaningful branch lengths have not been included and the tree represents only the topology, resulting in the appearance of bifurcation at the root.

      • Reviewer 3 suggested that the discussion on giant Starship TEs resembled more of a review:

      Lines 434-451: This section resembles more a review than a discussion of the results of the present work. This also highlights the lack of analysis on the genetic composition and putative function of the identified starship-like elements.

      The reviewer has a valid point. However, Starships are a recently discovered and thus underexplored genetic feature that readers from the wider mycology/plant pathology community may not yet be aware of. We believe it is warranted to include some additional exposition to give context for why their discovery here is novel, interesting and unexpected. We are naturally keen to investigate the make-up of the elements we have found in this lineage; however, that will require a substantial amount of further work. Analysis of Starships is not trivial: for example, the starfish tool is still under development and only a limited number of species have been used to train it. How best to compare elements is also an active area of investigation (they are dynamic in their structure and may include genes originating from the host genome or a previous host), and for this reason we believe it is out of scope to interrogate them alongside the other foundational genomic data presented in this paper.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We thank the referees for their careful reading of the manuscript and their valuable suggestions for improvements.

      General Statements:

      Existing SMC-based loop extrusion models successfully predict and characterize mesoscale genome spatial organization in vertebrate organisms, providing a valuable computational tool to the genome organization and chromatin biology fields. However, to date this approach is highly limited in its application beyond vertebrate organisms. This limitation arises because existing models require knowledge of CTCF binding sites, which act as effective boundary elements, blocking loop-extruding SMC complexes and thus defining TAD boundaries. However, CTCF is the predominant boundary element only in vertebrates. On the other hand, vertebrates only contain a small proportion of species in the tree of life, while TADs are nearly universal and SMC complexes are largely conserved. Thus, there is a pressing need for loop extrusion models capable of predicting Hi-C maps in organisms beyond vertebrates.

      The conserved-current loop extrusion (CCLE) model, introduced in this manuscript, extends the quantitative application of loop extrusion models in principle to any organism by liberating the model from the lack of knowledge regarding the identities and functions of specific boundary elements. By converting the genomic distribution of loop extruding cohesin into an ensemble of dynamic loop configurations via a physics-based approach, CCLE outputs three-dimensional (3D) chromatin spatial configurations that can be manifested in simulated Hi-C maps. We demonstrate that CCLE-generated maps well describe experimental Hi-C data at the TAD-scale. Importantly, CCLE achieves high accuracy by considering cohesin-dependent loop extrusion alone, consequently both validating the loop extrusion model in general (as opposed to diffusion-capture-like models proposed as alternatives to loop extrusion) and providing evidence that cohesin-dependent loop extrusion plays a dominant role in shaping chromatin organization beyond vertebrates.

      The success of CCLE unambiguously demonstrates that knowledge of the cohesin distribution is sufficient to reconstruct TAD-scale 3D chromatin organization. Further, CCLE signifies a shifted paradigm from the concept of localized, well-defined boundary elements, manifested in the existing CTCF-based loop extrusion models, to a concept also encompassing a continuous distribution of position-dependent loop extrusion rates. This new paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript presents a mathematical model for loop extrusion called the conserved-current loop extrusion model (CCLE). The model uses cohesin ChIP-Seq data to predict the Hi-C map and shows broad agreement between experimental Hi-C maps and simulated Hi-C maps. They test the model on Hi-C data from interphase fission yeast and meiotic budding yeast. The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, which they also propose extends to other organism/situations where Ctcf is absent.

      Response:

      We would like to point out that the referee's interpretation of our results, namely that, "The conclusion drawn by the authors is that peaks of cohesin represent loop boundaries in these situations, ...", is an oversimplification that we do not subscribe to. The referee's interpretation of our model is correct when there are strong, localized barriers to loop extrusion; however, the CCLE model allows for loop extrusion rates that are position-dependent and take on a range of values. The CCLE model also allows the loop extrusion model to be applied to organisms without known boundary elements. Thus, strictly interpreting the positions of cohesin peaks as loop boundaries overlooks a key idea to emerge from the CCLE model.

      Major comments:

      1. More recent micro-C/Hi-C maps, particularly for budding yeast mitotic cells and meiotic cells show clear puncta, representative of anchored loops, which are not well recapitulated in the simulated data from this study. However, such puncta are cohesin-dependent as they disappear in the absence of cohesin and are enhanced in the absence of the cohesin release factor, Wapl. For example - see the two studies below. The model is therefore missing some key elements of the loop organisation. How do the authors explain this discrepancy? It would also be very useful to test whether the model can predict the increased strength of loop anchors when Wapl1 is removed and cohesin levels increase.

      Costantino L, Hsieh TS, Lamothe R, Darzacq X, Koshland D. Cohesin residency determines chromatin loop patterns. Elife. 2020 Nov 10;9:e59889. doi: 10.7554/eLife.59889. PMID: 33170773; PMCID: PMC7655110.

      Barton RE, Massari LF, Robertson D, Marston AL. Eco1-dependent cohesin acetylation anchors chromatin loops and cohesion to define functional meiotic chromosome domains. Elife. 2022 Feb 1;11:e74447. doi: 10.7554/eLife.74447. Epub ahead of print. PMID: 35103590; PMCID: PMC8856730.

      Response:

      We are perplexed by this referee comment. While we agree that puncta representing loop anchors are a feature of Hi-C maps, as noted by the referee, we would reinforce that our CCLE simulations of meiotic budding yeast (Figs. 5A and 5B of the original manuscript) demonstrate an overall excellent description of the experimental meiotic budding yeast Hi-C map, including puncta arising from loop anchors. This CCLE model-experiment agreement for meiotic budding yeast is described and discussed in detail in the original manuscript and the revised manuscript (lines 336-401).

      To further emphasize and extend this point, we now also address the Hi-C of mitotic budding yeast, which was not included in the original manuscript. We have added an entire new section to the revised manuscript entitled "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae", including the new Figure 6, which presents a comparison between a portion of the mitotic budding yeast Hi-C map from Costantino et al. and the corresponding CCLE simulation at 500 bp resolution. In this case too, the CCLE model describes the data well, including the puncta, further addressing the referee's concern that the CCLE model is missing some key elements of loop organization.

      Concerning the referee's specific comment about the role of Wapl, we note that in order to apply CCLE when Wapl is removed, the corresponding cohesin ChIP-seq in the absence of Wapl should be available. To our knowledge, such data is not currently available and therefore we have not pursued this explicitly. However, we would reinforce that as Wapl is a factor that promotes cohesin unloading, its role is already effectively represented in the optimized value for LEF processivity, which encompasses LEF lifetime. In other words, if Wapl has a substantial effect it will be captured already in this model parameter.

      2. Related to the point above, the simulated data has much higher resolution than the experimental data (1kb vs 10kb in the fission yeast dataset). Given that loop size is in the 20-30kb range, a good resolution is important to see the structural features of the chromosomes. Can the model observe these details that are averaged out when the resolution is increased?

      Response:

      We agree with the referee that higher resolution is preferable to low resolution. In practice, however, there is a trade-off between resolution and noise. The first experimental interphase fission yeast Hi-C data of Mizuguchi et al. 2014 corresponds to 10 kb resolution. To compare our CCLE simulations to these published experimental data, as described in the original manuscript, we bin our 1-kb-resolution simulations to match the 10 kb experimental measurements. Nevertheless, CCLE can readily predict the interphase fission yeast Hi-C map at higher resolution by reducing the bin size (or, if necessary, reducing the lattice site size of the simulations themselves). In the revised manuscript, we have added comparisons between CCLE's predicted Hi-C maps and newer Micro-C data for S. pombe from Hsieh et al. (Ref. [50]) in the new Supplementary Figures 5-9. We have chosen to present these comparisons at 2 kb resolution, which is the same resolution as used for our meiotic budding yeast comparisons. Also included in Supplementary Figures 5-9 are comparisons between the original Hi-C maps of Mizuguchi et al. and the newer maps of Hsieh et al., binned to 10 kb resolution. Inspection of these figures shows that CCLE provides a good description of Hsieh et al.'s experimental Hi-C maps and does not reveal any major new features in the interphase fission yeast Hi-C map on the 10-100 kb scale that were not already apparent from the Hi-C maps of Mizuguchi et al. 2014. Thus, the CCLE model performs well across this range of effective resolutions.
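
      The binning step mentioned above (1 kb simulated bins coarsened to 10 kb) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `coarsen_contact_map`, the choice to sum (rather than average) contacts within each block, and the decision to drop an incomplete trailing bin are all assumptions made for illustration.

```python
import numpy as np

def coarsen_contact_map(contacts: np.ndarray, factor: int) -> np.ndarray:
    """Sum a square contact map over factor x factor blocks.

    Hypothetical helper: e.g. factor=10 turns 1 kb bins into 10 kb bins.
    Summing (not averaging) within blocks is an assumption.
    """
    n = contacts.shape[0]
    if contacts.ndim != 2 or contacts.shape != (n, n):
        raise ValueError("contact map must be a square 2D array")
    n_coarse = n // factor  # any incomplete trailing bin is dropped
    trimmed = contacts[: n_coarse * factor, : n_coarse * factor]
    # Split each axis into (coarse bin, position within bin) and sum within bins.
    blocks = trimmed.reshape(n_coarse, factor, n_coarse, factor)
    return blocks.sum(axis=(1, 3))
```

      Coarsening a simulated map in this way trades resolution for lower noise per bin, which is the trade-off described in the response.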

      3. Transcription, particularly convergent has been proposed to confer boundaries to loop extrusion. Can the authors recapitulate this in their model?

      Response:

      In response to the suggestion of the reviewer we have now calculated the correlation between cohesin ChIP-seq and the locations of convergent gene pairs, which is now presented in Supplementary Figures 17 and 18. Accordingly, in the revised manuscript, we have added the following text to the Discussion (lines 482-498):

      "In vertebrates, CTCF defines the locations of most TAD boundaries. It is interesting to ask what might play that role in interphase S. pombe as well as in meiotic and mitotic S. cerevisiae. A number of papers have suggested that convergent gene pairs are correlated with cohesin ChIP-seq in both S. pombe [65, 66] and S. cerevisiae [66-71]. Because CCLE ties TADs to cohesin ChIP-seq, a strong correlation between cohesin ChIP-seq and convergent gene pairs would be an important clue to the mechanism of TAD formation in yeasts. To investigate this correlation, we introduce a convergent-gene variable that has a nonzero value between convergent genes and an integrated weight of unity for each convergent gene pair. Supplementary Figure 17A shows the convergent gene variable, so-defined, alongside the corresponding cohesin ChIP-seq for meiotic and mitotic S. cerevisiae. It is apparent from this figure that a peak in the ChIP-seq data is accompanied by a non-zero value of the convergent-gene variable in about 80% of cases, suggesting that chromatin looping in meiotic and mitotic S. cerevisiae may indeed be tied to convergent genes. Conversely, about 50% of convergent genes match peaks in cohesin ChIP-seq. The cross-correlation between the convergent-gene variable and the ChIP-seq of meiotic and mitotic S. cerevisiae is quantified in Supplementary Figures 17B and C. By contrast, in interphase S. pombe, cross-correlation between convergent genes and cohesin ChIP-seq in each of five considered regions is unobservably small (Supplementary Figure 18A), suggesting that convergent genes per se do not have a role in defining TAD boundaries in interphase S. pombe."
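
      One way to read the convergent-gene variable described in the quoted text is sketched below. This is an illustrative reconstruction, not the authors' code: the names `convergent_gene_variable` and `cross_correlation`, the uniform weight of 1/length across the region between the two genes, and the lagged Pearson-style normalisation are assumptions.

```python
import numpy as np

def convergent_gene_variable(n_sites, gaps):
    """Build a track that is nonzero only between convergent genes.

    gaps: hypothetical list of (start, end) half-open site ranges, one per
    convergent gene pair. Each pair is spread uniformly so that it
    integrates to one (assumed shape of the weight).
    """
    track = np.zeros(n_sites)
    for start, end in gaps:
        if end <= start:
            raise ValueError("each gap must span at least one site")
        track[start:end] += 1.0 / (end - start)
    return track

def cross_correlation(x, y, max_lag):
    """Cross-correlation of two equal-length tracks for lags -max_lag..max_lag.

    Tracks are z-scored over their full length, then averaged over the
    overlapping sites at each lag (an approximate normalisation).
    """
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    values = []
    for lag in lags:
        if lag >= 0:
            values.append(np.mean(x[: n - lag] * y[lag:]))
        else:
            values.append(np.mean(x[-lag:] * y[: n + lag]))
    return lags, np.array(values)
```

      In this sketch, the cohesin ChIP-seq signal would be passed as one track and the convergent-gene variable as the other, both sampled on the same genomic lattice.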

      Minor comments:

      1. In the discussion, the authors cite the fact that Mis4 binding sites do not give good prediction of the HI-C maps as evidence that Mis4 is not important for loop extrusion. This can only be true if the position of Mis4 measured by ChIP is a true reflection of Mis4 position. However, Mis4 binding to cohesin/chromatin is very dynamic and it is likely that this is too short a time scale to be efficiently cross-linked for ChIP. Conversely, extensive experimental data in vivo and in vitro suggest that stimulation of cohesin's ATPase by Mis4-Ssl3 is important for loop extrusion activity.

      Response:

      We apologize for the confusion on this point. We actually intended to convey that the absence of Mis4-Psc3 correlations in S. pombe suggests, from the point of view of CCLE, that Mis4 is not an integral component of loop-extruding cohesin, during the loop extrusion process itself. We agree completely that Mis4/Ssl3 is surely important for cohesin loading, and (given that cohesin is required for loop extrusion) Mis4/Ssl3 is therefore important for loop extrusion. Evidently, this part of our Discussion was lacking sufficient clarity. In response to both referees' comments, we have re-written the discussion of Mis4 and Pds5 to more carefully explain our reasoning and be more circumspect in our inferences. The re-written discussion is described below in response to Referee #2's comments.

      Nevertheless, on the topic of whether Nipbl-cohesin binding is too transient to be detected in ChIP-seq, the FRAP analysis presented by Rhodes et al. eLife 6:e30000 "Scc2/Nipbl hops between chromosomal cohesin rings after loading" indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds. As shown in the bottom panel of Supplementary Fig. 7 in the original manuscript (and the bottom panel of Supplementary Fig. 20 in the revised manuscript), there is a significant cross-correlation (~0.2) between the Nipbl ChIP-seq and Smc1 ChIP-seq in humans, indicating that a transient association between Nipbl and cohesin can be (and in fact is) detected by ChIP-seq.

      2. Inclusion of a comparison of this model compared to previous models (for example bottom up models) would be extremely useful. What is the improvement of this model over existing models?

      Response:

      As stated in the original manuscript, as far as we are aware, "bottom up" models, that quantitatively describe the Hi-C maps of interphase fission yeast or meiotic budding yeast or, indeed, of eukaryotes other than vertebrates, do not exist. Bottom-up models would require knowledge of the relevant boundary elements (e.g. CTCF sites), which, as stated in the submitted manuscript, are generally unknown for fission yeast, budding yeast, and other non-vertebrate eukaryotes. The absence of such models is the reason that CCLE fills an important need. Since bottom-up models for cohesin loop extrusion in yeast do not exist, we cannot compare CCLE to the results of such models.

      In the revised manuscript we now explicitly compare the CCLE model to the only bottom-up type of model describing the Hi-C maps of non-vertebrate eukaryotes by Schalbetter et al. Nat. Commun. 10:4795 2019, which we did cite extensively in our original manuscript. Schalbetter et al. use cohesin ChIP-seq peaks to define the positions of loop extrusion barriers in meiotic S. cerevisiae, for which the relevant boundary elements are unknown. In their model, specifically, when a loop-extruding cohesin anchor encounters such a boundary element, it either passes through with a certain probability, as if no boundary element is present, or stops extruding completely until the cohesin unbinds and rebinds.
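
      The barrier rule described above (pass through with some probability, otherwise stall until the cohesin unbinds) can be sketched as a single lattice update. This is a simplified illustration rather than the implementation of Schalbetter et al. or of the present work: the function `step_anchor`, the one-site step, and the convention that a blocked anchor stalls on the site just before the barrier are assumptions.

```python
import random

def step_anchor(position, direction, barriers, stalled, pass_probability, rng):
    """Advance one LEF anchor by one lattice site under an explicit-barrier rule.

    position: current lattice site of the anchor
    direction: +1 or -1
    barriers: set of lattice sites treated as barriers (e.g. ChIP-seq peaks)
    stalled: True if this anchor was already blocked at an earlier step
    pass_probability: chance of passing a barrier as if it were absent
    rng: a random.Random instance
    Returns (new_position, stalled).
    """
    if stalled:
        # A blocked anchor stays put until the LEF unbinds and rebinds,
        # which would be handled elsewhere in a full simulation.
        return position, True
    target = position + direction
    if target in barriers and rng.random() >= pass_probability:
        return position, True  # blocked: stop extruding on this side
    return target, False
```

      In a full simulation this update would be embedded in a stochastic scheme that also handles binding, unbinding and collisions between extruders; only the barrier rule is shown here.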

      In the revised manuscript we refer to this model as the "explicit barrier" model and have applied it to interphase S. pombe, using cohesin ChIP-seq peaks to define the positions of loop extrusion barriers. The corresponding simulated Hi-C map is presented in Supplementary Fig. 19 in comparison with the experimental Hi-C. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and Pearson correlation scores. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in several conditions including interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers.

      We have also added the following paragraph in the Discussion section of the manuscript to elaborate this point (lines 499-521):

      "Although 'bottom-up' models which incorporate explicit boundary elements do not exist for non-vertebrate eukaryotes, one may wonder how well such LEF models, if properly modified and applied, would perform in describing Hi-C maps with diverse features. To this end, we examined the performance of the model described in Ref. [49] in describing the Hi-C map of interphase S. cerevisiae. Reference [49] uses cohesin ChIP-seq peaks in meiotic S. cerevisiae to define the positions of loop extrusion barriers which either completely stall an encountering LEF anchor with a certain probability or let it pass. We apply this 'explicit barrier' model to interphase S. pombe, using its cohesin ChIP-seq peaks to define the positions of loop extrusion barriers, and using Ref. [49]'s best-fit value of 0.05 for the pass-through probability. Supplementary Figure 19A presents the corresponding simulated Hi-C map for the 0.3-1.3 kb region of Chr 2 of interphase S. pombe in comparison with the corresponding Hi-C data. It is evident that the explicit barrier model provides a poorer description of the Hi-C data of interphase S. pombe compared to the CCLE model, as indicated by the MPR and Pearson correlation scores of 1.6489 and 0.2267, respectively. While the explicit barrier model appears capable of accurately reproducing Hi-C data with punctate patterns, typically accompanied by strong peaks in the corresponding cohesin ChIP-seq, it seems less effective in cases such as in interphase S. pombe, where the Hi-C data lacks punctate patterns and sharp TAD boundaries, and the corresponding cohesin ChIP-seq shows low-contrast peaks. The success of the CCLE model in describing the Hi-C data of both S. pombe and S. cerevisiae, which exhibit very different features, suggests that the current paradigm of localized, well-defined boundary elements may not be the only approach to understanding loop extrusion. By contrast, CCLE allows for a concept of continuous distribution of position-dependent loop extrusion rates, arising from the aggregate effect of multiple interactions between loop extrusion complexes and chromatin. This paradigm offers greater flexibility in recapitulating diverse features in Hi-C data than strictly localized loop extrusion barriers."

      Reviewer #1 (Significance (Required)):

      This simple model is useful to confirm that cohesin positions dictate the position of loops, which was predicted already and proposed in many studies. However, it should be considered a starting point as it does not faithfully predict all the features of chromatin organisation, particularly at better resolution.

      Response:

      As described in more detail above, we do not agree with the assertion of the referee that the CCLE model "does not faithfully predict all the features of chromatin organization, particularly at better resolution" and provide additional new data to support the conclusion that the CCLE model provides a much needed approach to model non-vertebrate contact maps and outperforms the single prior attempt to predict budding yeast Hi-C data using information from cohesin ChIP-seq.

      It will mostly be of interest to those in the chromosome organisation field, working in organisms or systems that do not have ctcf.

      Response:

      We agree that this work will be of special interest to researchers working on chromatin organization of non-vertebrate organisms. We would reinforce that yeast are frequently used models for the study of cohesin, condensin, and chromatin folding more generally. Indeed, in the last two months alone there are two Molecular Cell papers, one Nature Genetics paper, and one Cell Reports paper where loop extrusion in yeast models is directly relevant. We also believe, however, that the model will be of interest for the field in general as it simultaneously encompasses various scenarios that may lead to slowing down or stalling of LEFs.

      This reviewer is a cell biologist working in the chromosome organisation field, but does not have modelling experience and therefore does not have the expertise to determine if the modelling part is mathematically sound and has assumed that it is.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      __Response: __

      We agree with the referee's statement that "extrusion is widely accepted, even if not universally so". We disagree with the referee that this state of affairs means that "the need to demonstrate this (i.e. loop extrusion) is questionable". On the contrary, studies such as ours that provide further compelling evidence that cohesin-based loop extrusion is the primary organizer of chromatin must surely be welcomed: first, to persuade those who remain unconvinced by the loop extrusion mechanism in general, and, second, because quantitative models of loop extrusion capable of reproducing Hi-C maps in yeasts and other non-vertebrate eukaryotes have been lacking until the present work, leaving open the question of whether loop extrusion can describe Hi-C maps beyond vertebrates. CCLE has now answered that question in the affirmative. Moreover, the existence of a robust model to predict contact maps in non-vertebrate models, which are extensively used in the pursuit of research questions in chromatin biology, will be broadly enabling to the field.

      It is fundamental that if a simple, physically-plausible model/hypothesis is able to describe experimental data quantitatively, it is indeed appropriate to ascribe considerable weight to that model/hypothesis (until additional data become available to refute the model).

      How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling?

      Response:

      As noted above and in the original manuscript, we are unaware of previous quantitative modeling of cohesin-based loop extrusion and the resultant Hi-C maps in organisms that lack CTCF, namely non-vertebrate eukaryotic models such as fission yeast or budding yeast, as we apply here. As noted in the original manuscript, previous quantitative modeling of Hi-C maps based on cohesin loop extrusion and CTCF boundary elements has been convincing that loop extrusion is indeed relevant in vertebrates, but the restriction to vertebrates excludes most of the tree of life.

      Below, the referee cites two examples of loop extrusion outside of vertebrates. The one that is suggested to correspond to yeast cells (Dequeker et al. Nature 606:197 2022) actually corresponds to mouse cells, which are vertebrate cells. The other one models the Hi-C map of the prokaryote, Bacillus subtilis, based on loop extrusion of the bacterial SMC complex thought to most resemble condensin (not cohesin), subject to barriers to loop extrusion that are related to genes or involving prokaryote-specific Par proteins (Brandao et al. PNAS 116:20489 2019). We have referenced this work in the revised manuscript but would reinforce that it lacks utility in predicting the contact maps for non-vertebrate eukaryotes.

      Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species.

      Response:

      In the revised manuscript, we have replaced "suggesting that the underlying mechanism that governs loop extrusion by cohesin is identical in both species" with "suggesting loop-extruding cohesins possess similar properties in both species" (lines 367-368).

      As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?

      Response:

      A hypothetical mechanism that generates the same instantaneous loop distributions and correlations as loop extrusion would lead to the same Hi-C map as does loop extrusion. This circumstance is not confined to CCLE, but is equally applicable to previous CTCF-based loop extrusion models. It holds because Hi-C and ChIP-seq, and therefore models that seek to describe these measurements, provide a snapshot of the chromatin configuration at one instant of time.

      We would reinforce that there is no physical basis for a diffusion capture model with an approximately exponential loop size distribution. Nevertheless, one can reasonably ask whether a physically sensible diffusion capture model can simultaneously match cohesin ChIP-seq and Hi-C. Motivated by the referee's comment, we have addressed this question and, accordingly, in the revised manuscript, we have added (1) an entire subsection entitled "Diffusion capture does not reproduce experimental interphase S. pombe Hi-C maps" (lines 303-335) and (2) Supplementary Figure 15. As we now demonstrate, the CCLE model vastly outperforms an equilibrium binding model in reproducing the experimental Hi-C maps and measured P(s).

      *2. I do not understand how the loop extrusion residence time drops out. As I understand it, Eq 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify? *

      __Response: __

      As noted above, Hi-C and ChIP-seq both capture chromatin configuration at one instant in time. Therefore, such measurements cannot and do not provide any time-scale information, such as the loop extrusion residence time (LEF lifetime) or the mean loop extrusion rate. For this reason, neither our CCLE simulations, nor other researchers' previous simulations of loop extrusion in vertebrates with CTCF boundary elements, provide any time-scale information, because the experiments they seek to describe do not contain time-scale information. The Hi-C map simulations can and do provide information concerning the loop size, which is the product of the loop lifetime and the loop extrusion rate. Lines 304-305 of the revised manuscript include the text: "Because Hi-C and ChIP-seq both characterize chromatin configuration at a single instant of time, and do not provide any direct time-scale information, ..."

      In practice, we set the LEF lifetime to be some explicit value with an arbitrary time unit. We have added a sentence in the Methods that reads, "In practice, however, we set the LEF dissociation rate to 5e-4 time-unit^{-1} (equivalent to a lifetime of 2000 time-units), and the nominal LEF extrusion rate (aka \rho*L/\tau, see Supplementary Methods) can be determined from the given processivity" (lines 599-602), to clarify this point. We have also changed the terminology from "timesteps" to "LEF events" in the manuscript as the latter is more accurate for our purpose.
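The arbitrariness of the time unit can be illustrated with a minimal sketch. It assumes only that processivity is the product of LEF lifetime and nominal extrusion rate, and that lifetime is the inverse of the dissociation rate. The function name and the processivity value are hypothetical and are not taken from the CCLE code.

```python
def nominal_extrusion_rate(processivity_kb, dissociation_rate):
    """Nominal extrusion rate in kb per time-unit.

    Assumes processivity = lifetime * rate and lifetime = 1 / dissociation_rate.
    """
    lifetime = 1.0 / dissociation_rate
    return processivity_kb / lifetime


# Hypothetical processivity, chosen for illustration only.
processivity_kb = 30.0

# 5e-4 per time-unit is the value quoted in the response; 5e-3 is an
# arbitrary alternative that corresponds to a different choice of time unit.
for dissociation_rate in (5e-4, 5e-3):
    lifetime = 1.0 / dissociation_rate
    rate = nominal_extrusion_rate(processivity_kb, dissociation_rate)
    # lifetime and rate change with the time unit, their product does not.
    print(lifetime, rate, lifetime * rate)
```

Under these assumptions, changing the dissociation rate rescales both the lifetime and the extrusion rate, while the length scale that a snapshot measurement can constrain stays fixed.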

      3. The assumptions in the solution and application of the CCLE model are potentially constraining to a limited number of scenarios. In particular the authors specify that current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.) (where current might be dominated by A_n and V_n), unloading sites (D_n and V_{n-1}), or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.

      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.

      __Response: __

      Our response to Referee 2's Comments 3. and 4. is that both in the original manuscript and in the revised manuscript we clearly delineate the assumptions underlying CCLE and we carefully assess the extent to which these assumptions are violated (lines 123-126 and 263-279 in the revised manuscript). For example, Supplementary Figure 12 shows that across the S. pombe genome as a whole, violations of the CCLE assumptions are small. Supplementary Figure 13 shows that violations are similarly small for meiotic S. cerevisiae. However, to explicitly address the concern of the referee, we have added the following sentences to the revised manuscript:

      Lines 277-279:

      "While loop extrusion in interphase S. pombe seems to well satisfy the assumptions underlying CCLE, this may not always be the case in other organisms."

      Lines 359-361:

      "In addition, the three quantities, given by Eqs. 6, 7, and 8, are distributed around zero with relatively small fluctuations (Supplementary Fig. 13), indicating that CCLE model is self-consistent in this case also."

      In the case of mitotic S. cerevisiae, Supplementary Figure 14 shows that these quantities are small for most genomic locations, except near the cohesin ChIP-seq peaks. We ascribe these greater violations of CCLE's assumptions at the locations of cohesin peaks in part to the low processivity of mitotic cohesin in S. cerevisiae, compared to that of meiotic S. cerevisiae and interphase S. pombe, and in part to the low CCLE loop extrusion rate at the cohesin peaks. We have added a paragraph at the end of the Section "CCLE Describes TADs and Loop Configurations in Mitotic S. cerevisiae" to reflect these observations (lines 447-461).

      5. *The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1) and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data. *

      Furthermore, I believe that the conclusions drawn on this point are flawed, or at least, stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe, and it is not essential in humans. Aside from the molecular evidence in human Nipbl/cohesin (acknowledged by authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than binding partner Mau2 as in ref 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish from background and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds to cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.

      Similar types of critiques could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary to loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; cross-correlation peaks at about 0.6, but note that by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).

      *Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model. *

      __Response: __

      As stated above, our study demonstrates that the CCLE approach is able to take as input cohesin (Psc3) ChIP-seq data and produce as output simulated Hi-C maps that well reproduce the experimental Hi-C maps of interphase S. pombe and meiotic S. cerevisiae. This result is evident from the multiple Hi-C comparison figures in both the original and the revised manuscripts. In light of this circumstance, the referee's statement that it is "questionable", that CCLE shows that cohesin distribution (as quantified by cohesin ChIP-seq) is linked to cohesin looping (as quantified by Hi-C), is demonstrably incorrect.

      However, we did not intend to suggest that Nipbl and Pds5 are not crucial for cohesin loading, as the reviewer states. Rather, our inquiries relate to a more nuanced question of whether these factors only reside at loading sites or, instead, remain as a more long-lived constituent component of the loop extrusion complex. We regret any confusion and have endeavored to clarify this point in the revised manuscript in response to Referee 2's Comment 5. as well as Referee 1's Minor Comment 1. We have now better explained how the CCLE model may offer new insight from existing ChIP-seq data in general and from Mis4/Nipbl and Pds5 ChIP-seq, in particular. Accordingly, we have followed Referee 2's advice to heavily revise the relevant section of the Discussion.

      To this end, we have removed the following text from the original manuscript:

      "The fact that the cohesin distribution along the chromatin is strongly linked to chromatin looping, as evident by the success of the CCLE model, allows for new insights into in vivo LEF composition and function. For example, recently, two single-molecule studies [37, 38] independently found that Nipbl, which is the mammalian analogue of Mis4, is an obligate component of the loop-extruding human cohesin complex. Ref. [37] also found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops. On this basis, Ref. [32] proposed that, while Nipbl-containing cohesin is responsible for loop extrusion, Pds5-containing cohesin is responsible for sister chromatid cohesion, neatly separating cohesin's two functions according to composition. However, the success of CCLE in interphase S. pombe, together with the observation that the Mis4 ChIP-seq signal is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 7) allows us to infer that Mis4 cannot be a component of loop-extruding cohesin in S. pombe. On the other hand, Pds5 is correlated with Psc3 in S. pombe (Supplementary Fig. 7) suggesting that both proteins are involved in loop-extruding cohesin, contradicting a hypothesis that Pds5 is a marker for cohesive cohesin in S. pombe. In contrast to the absence of Mis4-Psc3 correlation in S. pombe, in humans, Nipbl ChIP-seq and Smc1 ChIP-seq are correlated (Supplementary Fig. 7), consistent with Ref. [32]'s hypothesis that Nipbl can be involved in loop-extruding cohesin in humans. However, Ref. [55] showed that human Hi-C contact maps in the absence of Nipbl's binding partner, Mau2 (Ssl3 in S. pombe [56]) show clear TADs, consistent with loop extrusion, albeit with reduced long-range contacts in comparison to wild-type maps, indicating that significant loop extrusion continues in live human cells in the absence of Nipbl-Mau2 complexes. 
These collected observations suggest the existence of two populations of loop-extruding cohesin complexes in vivo, one that involves Nipbl-Mau2 and one that does not. Both types are present in mammals, but only Mis4-Ssl3-independent loop-extruding cohesin is present in S. pombe."

      And we have replaced it with the following text in the revised manuscript (lines 533-568):

      "As noted above, the input for our CCLE simulations of chromatin organization in S. pombe, was the ChIP-seq of Psc3, which is a component of the cohesin core complex [75]. Accordingly, Psc3 ChIP-seq represents how the cohesin core complex is distributed along the genome. In S. pombe, the other components of the cohesin core complex are Psm1, Psm3, and Rad21. Because these proteins are components of the cohesin core complex, we expect that the ChIP-seq of any of these proteins would closely match the ChIP-seq of Psc3, and would equally well serve as input for CCLE simulations of S. pombe genome organization. Supplementary Figure 20C confirms significant correlations between Psc3 and Rad21. In light of this observation, we then reason that the CCLE approach offers the opportunity to investigate whether other proteins beyond the cohesin core are constitutive components of the loop extrusion complex during the extrusion process (as opposed to cohesin loading or unloading). To elaborate, if the ChIP-seq of a non-cohesin-core protein is highly correlated with the ChIP-seq of a cohesin core protein, we can infer that the protein in question is associated with the cohesin core and therefore is a likely participant in loop-extruding cohesin, alongside the cohesin core. Conversely, if the ChIP-seq of a putative component of the loop-extruding cohesin complex is uncorrelated with the ChIP-seq of a cohesin core protein, then we can infer that the protein in question is unlikely to be a component of loop-extruding cohesin, or at most is transiently associated with it.

      For example, in S. pombe, the ChIP-seq of the cohesin regulatory protein, Pds5 [74], is correlated with the ChIP-seq of Psc3 (Supplementary Fig. 20B) and with that of Rad21 (Supplementary Fig. 20D), suggesting that Pds5 can be involved in loop-extruding cohesin in S. pombe, alongside the cohesin core proteins. Interestingly, this inference concerning fission yeast cohesin subunit, Pds5, stands in contrast to the conclusion from a recent single-molecule study [38] concerning cohesin in vertebrates. Specifically, Reference [38] found that cohesin complexes containing Pds5, instead of Nipbl, are unable to extrude loops.

      Additionally, as noted above, in S. pombe the ChIP-seq signal of the cohesin loader, Mis4, is uncorrelated with the Psc3 ChIP-seq signal (Supplementary Fig. 20A), suggesting that Mis4 is, at most, a very transient component of loop-extruding cohesin in S. pombe, consistent with its designation as a "cohesin loader". However, both References [38] and [39] found that Nipbl (counterpart of S. pombe's Mis4) is an obligate component of the loop-extruding human cohesin complex, more than just a mere cohesin loader. Although CCLE has not yet been applied to vertebrates, from a CCLE perspective, the possibility that Nipbl may be required for the loop extrusion process in humans is bolstered by the observation that in humans Nipbl ChIP-seq and Smc1 ChIP-seq show significant correlations (Supplementary Fig. 20G), consistent with Ref. [32]'s hypothesis that Nipbl is involved in loop-extruding cohesin in vertebrates. A recent theoretical model of the molecular mechanism of loop extrusion by cohesin hypothesizes that transient binding by Mis4/Nipbl is essential for permitting directional reversals and therefore for two-sided loop extrusion [41]. Surprisingly, there are significant correlations between Mis4 and Pds5 in S. pombe (Supplementary Fig. 20E), indicating Pds5-Mis4 association, outside of the cohesin core complex."

      In response to Referee 2's specific comment that "at least two studies have raised concerns about Nipbl ChIP-seq results", we note two points. First, while Hu et al. Nucleic Acids Res 43:e132 2015 present a general method for calibrating ChIP-seq results, they do not measure Mis4/Nipbl ChIP-seq, nor do they raise any specific concerns about Mis4/Nipbl ChIP-seq. Second (as noted above, in response to Referee 1's comment), the FRAP analysis presented by Rhodes et al. eLife 6:e30000 indicates that, in HeLa cells, Nipbl has a residence time bound to cohesin of about 50 seconds. Nevertheless, as shown in Supplementary Fig. 20G in the revised manuscript, there is a significant cross-correlation between the Nipbl ChIP-seq and Smc1 ChIP-seq in humans, indicating that a transient association between Nipbl and cohesin is detected by ChIP-seq, the referees' concerns notwithstanding.

      We thank the referee for pointing out Schwarzer et al. Nature 551:51 2017. However, our interpretation of these data differs from the referee's. As noted in our original manuscript, Nipbl has traditionally been considered to be a cohesin loading factor. If the role of Nipbl was solely to load cohesin, then we would expect that depleting Nipbl would have a major effect on the Hi-C map, because fewer cohesins are loaded onto the chromatin. Figure 2 of Schwarzer et al. Nature 551:51 2017 shows the effect of depleting Nipbl on a vertebrate Hi-C map. Even in this case, when Nipbl is depleted, the figure shows that TADs persist, albeit considerably attenuated. According to the authors' own analysis associated with Fig. 2 of their paper, these attenuated TADs correspond to a smaller number of loop-extruding cohesin complexes than in the presence of Nipbl. Since Nipbl is depleted, these loop-extruding cohesins necessarily cannot contain Nipbl. Thus, the data and analysis of Schwarzer et al. Nature 551:51 2017 actually seem consistent with the existence of a population of loop-extruding cohesin complexes that do not contain Nipbl.

      Concerning the referee's comment that we cannot be sure whether Pds5 ChIP is associated with extrusive or cohesive cohesin, we note that, as explained in the manuscript, we assume that the cohesive cohesins are uniformly distributed across the genome, and therefore that peaks in the cohesin ChIP-seq are associated with loop-extruding cohesins. The success of CCLE in describing Hi-C maps justifies this assumption a posteriori. Supplementary Figure 20B shows that the ChIP-seq of Pds5 is correlated with the ChIP-seq of Psc3 in S. pombe, that is, that peaks in the ChIP-seq of Psc3, assumed to derive from loop-extruding cohesin, are accompanied by peaks in the ChIP-seq of Pds5. This is the reasoning allowing us to associate Pds5 with loop-extruding cohesin in S. pombe.
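The correlation argument used above can be made concrete with a minimal sketch of a lagged Pearson cross-correlation between two ChIP-seq tracks. It assumes both tracks are equal-length lists of signal values on the same genomic bins. The function names are illustrative, and this is not claimed to be the procedure used for Supplementary Figure 20.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cross_correlation(track_a, track_b, max_lag):
    """Correlate track_a[n] with track_b[n + lag] for each lag in bins.

    Returns a dict mapping lag -> Pearson coefficient, computed on the
    overlapping part of the two tracks.
    """
    n = len(track_a)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = track_a[:n - lag], track_b[lag:]
        else:
            a, b = track_a[-lag:], track_b[:n + lag]
        result[lag] = pearson(a, b)
    return result
```

A peak near zero lag indicates that the two proteins tend to occupy the same genomic bins; a flat profile indicates no detectable co-localization at the resolution of the tracks.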

      6. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).

      Response:

      We thank the referee for this advice. Following this advice, throughout the revised manuscript, we have replaced our original calculation of the Pearson correlation coefficient of unscaled Hi-C maps with a calculation of the Pearson correlation coefficient of rescaled Hi-C maps. Since the MPR is formed from ratios of simulated to experimental Hi-C maps, this metric is unchanged by the proposed rescaling.
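The rescaling described here can be sketched as follows. This is a minimal illustration, assuming that each map is a square list of lists, that P(s) is estimated as the mean contact value on each diagonal of that same map, and that the correlation is taken over the upper triangle. The function names and the `min_sep` parameter are hypothetical and do not come from the manuscript.

```python
import math


def contact_probability(hic):
    """Mean contact value P(s) on each diagonal s = |i - j| of a square map."""
    n = len(hic)
    return [sum(hic[i][i + s] for i in range(n - s)) / (n - s) for s in range(n)]


def rescale_by_ps(hic):
    """Divide every entry by P(|i - j|); entries on all-zero diagonals become 0."""
    n = len(hic)
    ps = contact_probability(hic)
    return [
        [hic[i][j] / ps[abs(i - j)] if ps[abs(i - j)] > 0 else 0.0 for j in range(n)]
        for i in range(n)
    ]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rescaled_map_correlation(hic_a, hic_b, min_sep=1):
    """Pearson correlation of two P(s)-rescaled maps over the upper triangle."""
    ra, rb = rescale_by_ps(hic_a), rescale_by_ps(hic_b)
    n = len(ra)
    pairs = [(i, j) for i in range(n) for j in range(i + min_sep, n)]
    return pearson([ra[i][j] for i, j in pairs], [rb[i][j] for i, j in pairs])
```

After rescaling, a map that contains nothing but the distance decay becomes flat, so any remaining correlation between two maps reflects features such as loops and domain boundaries rather than the shared P(s) trend.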

      As explained in the original manuscript, we attribute the lower experiment-simulation correlation in the meiotic budding yeast Hi-C maps to the larger statistical errors of the meiotic budding yeast dataset, which arise because of its higher genomic resolution: all else being equal, we can expect 25 times the counts in a 10 kb x 10 kb bin as in a 2 kb x 2 kb bin. For the same reason, we expect larger statistical errors in the mitotic budding yeast dataset as well. Lower correlations for noisier data are to be expected in general.
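The factor of 25 follows from the ratio of bin areas. A short sketch of the arithmetic is given below; the further step to relative errors assumes Poisson counting statistics, which is an assumption of this illustration and is not stated in the response.

```python
import math


def bin_count_ratio(coarse_kb, fine_kb):
    """Ratio of expected counts in a coarse square bin to those in a fine one,
    assuming counts scale with bin area at fixed sequencing depth."""
    return (coarse_kb / fine_kb) ** 2


def relative_poisson_error(counts):
    """Relative statistical error of a Poisson-distributed count."""
    return 1.0 / math.sqrt(counts)


ratio = bin_count_ratio(10, 2)
# Under the Poisson assumption, the relative error grows as the square root
# of the count ratio when moving from the coarse bin to the fine bin.
error_ratio = math.sqrt(ratio)
print(ratio, error_ratio)
```

Under that assumption, a 2 kb map carries roughly five times the relative per-bin noise of a 10 kb map built from the same number of reads.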

      *7. Please explain why the difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes. *

      __Response: __

      For simplicity, the present version of CCLE sets the site-dependent loop extrusion rates by assuming that the cohesin ChIP-seq signal has equal contributions from left and right anchors. Then, we carry out our simulations which subsequently allow us to examine the simulated left and right currents and their difference at every site. The distributions of normalized left-right difference currents are shown in Supplementary Figures 12B, 13B, and 14D, for interphase S. pombe, meiotic S. cerevisiae, and mitotic S. cerevisiae, respectively. They are all centered at zero with standard deviations of 0.12, 0.16, and 0.33. Thus, it emerges from our simulations that the difference current is indeed generally small.
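The quantity examined here can be sketched in a few lines. The example assumes that the simulated right and left currents are available as equal-length lists indexed by lattice site. The function names are illustrative, and skipping sites with zero total current is a choice made for this sketch only.

```python
import statistics


def normalized_difference_currents(right, left):
    """Return (R_n - L_n) / (R_n + L_n) for every site with nonzero total current."""
    return [(r - l) / (r + l) for r, l in zip(right, left) if r + l > 0]


def summarize(values):
    """Mean and population standard deviation of the distribution."""
    return statistics.fmean(values), statistics.pstdev(values)
```

A distribution centered at zero with a small standard deviation is what the response reports for Supplementary Figures 12B, 13B, and 14D.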

      8. Optional, but I think would greatly improve the manuscript, but can the authors: a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity

      __Response: __

      In response to Referee 1's similar comment, we have calculated the correlation between the locations of convergent genes and cohesin ChIP-seq. Supplementary Figure 18A in the revised manuscript shows that for interphase S. pombe no correlations are evident, whereas for both meiotic and mitotic S. cerevisiae, there are significant correlations between these two quantities (Supplementary Fig. 17).

      *b) apply this methodology to vertebrate cell data *

      __Response: __

      The application of CCLE to vertebrate data is outside the scope of this paper which, as we have emphasized, has the goal of developing a model that can be robustly applied to non-vertebrate eukaryotic genomes. Nevertheless, CCLE is, in principle, applicable to all organisms in which loop extrusion by SMC complexes is the primary mechanism for chromatin spatial organization.

      9. *A Github link is provided but the code is not currently available. *

      __Response: __

      The code is now available.

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.

      __Response: __

      The terminology "timesteps" used in the original manuscript in fact should mean "the number of LEF events performed" in the simulation. Therefore, we have changed the terminology from "timesteps" to "LEF events".

      The choice of 15000 LEF events is empirically determined to ensure that loop extrusion steady state is achieved for the range of parameters considered. To address the referee's concern regarding the uncertainty of achieving steady state after 15000 LEF events, we compared two loop size distributions, each encompassing 1000 data points equally separated in time: one between LEF events 15000 and 35000, and the other between LEF events 80000 and 100000. The two distributions are identical within errors, suggesting that the loop extrusion steady state is well achieved within 15000 LEF events.
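One possible way to make "identical within errors" concrete is to histogram the two samples with common bin edges and compute a chi-square per bin. The sketch below assumes Poisson errors on the bin counts and equal sample sizes. The response does not state which comparison was actually used, so this is an illustration and not the authors' procedure.

```python
def histogram(values, edges):
    """Count values in half-open bins [edges[k], edges[k + 1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    return counts


def reduced_chi_square(counts_a, counts_b):
    """Chi-square per bin between two count histograms of equal sample size.

    Each bin contributes (a - b)^2 / (a + b), the squared difference divided
    by its Poisson variance. Bins empty in both histograms are ignored.
    """
    terms = [(a - b) ** 2 / (a + b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    if not terms:
        raise ValueError("both histograms are empty")
    return sum(terms) / len(terms)
```

Values of order one or below would be consistent with the two windows sampling the same steady-state distribution; values much larger than one would indicate that the loop size distribution is still evolving.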

      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?

      __Response: __

      As stated in the original manuscript, the errors on \rho_c are on the order of 10%-20% (for S. pombe). Thus, fits with \rho_c=0 are significantly poorer than with the best-fit values of \rho_c.

      *3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP. *

      __Response: __

      We have chosen not to do this, because we judge that the manuscript is already long enough. Figures 3A, 5D, and 6C already compare the experimental and simulated ChIP-seq, and these figures already contain more information than the figures proposed by the referee.

      4. *A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited. *

      __Response: __

      We thank the referee for pointing out this citation. We have added it to the revised manuscript.

      5. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.

      __Response: __

      As noted above, throughout the revised manuscript, we now give the Pearson correlation coefficients of scaled-by-P(s) Hi-C maps.

      6. *In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made. *

      __Response: __

      It is the former. We have modified the manuscript to clarify that LEFs "initially bind to empty, adjacent chromatin lattice sites with a binding probability, that is uniformly distributed across the genome." (lines 587-588).

      *7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"? *

      __Response: __

      Here "processivity of isolated LEFs" is defined as the processivity of one LEF without the interference (blocking) from other LEFs. We have changed "processivity of loops extruded by isolated LEFs" to "processivity of isolated LEFs" for clarity.

      8. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.

      __Response: __

      In the revised manuscript, we have added an additional sentence, and have removed the offending parentheses.

      9. *Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript. *

      __Response: __

      In the revised manuscript, we have removed mention of the "barrier parameter" from the discussion.

      *10. Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022).*

      __Response: __

      In fact, Dequeker et al. Nature 606:197 2022 concerns the role of MCM complexes in blocking cohesin loop extrusion in mouse zygotes. Mouse is a vertebrate. The only aspect of this paper associated with yeast is the observation of cohesin blocking by yeast MCM bound to the ARS1 replication origin site, which is inserted into a piece of lambda phage DNA. No yeast genome is used in the experiment. Therefore, the referee is mistaken to suggest that this paper models yeast genome organization.

      We thank the referee for pointing out Brandao et al. PNAS 116:20489 2019, which includes the development of a tour-de-force model of condensin-based loop extrusion in the prokaryote, Bacillus subtilis, in the presence of gene barriers to loop extrusion. To acknowledge this paper, we have changed the objectionable sentence to now read (lines 571-575):

      "... prior LEF models have been overwhelmingly limited to vertebrates, which express CTCF and where CTCF is the principal boundary element. Two exceptions, in which the LEF model was applied to non-vertebrates, are Ref. [49], discussed above, and Ref. [76] (Brandao et al.), which models the Hi-C map of the prokaryote, Bacillus subtilis, on the basis of condensin loop extrusion with gene-dependent barriers."

      *Referees cross-commenting *

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      *Reviewer #2 (Significance (Required)):

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.*

      __Response: __

      We agree with the referee that analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. We also agree with the referee that it works well as a descriptive model (of Hi-C maps in S. pombe and S. cerevisiae). Obviously, we disagree with the referee's other comments. For us, being able to describe the different-appearing Hi-C maps of interphase S. pombe (Fig. 1 and Supplementary Figures 1-9), meiotic S. cerevisiae (Fig. 5) and mitotic S. cerevisiae (Fig. 6), all with a common model with just a few fitting parameters that differ between these examples, is significant and novel. The reviewer overlooks the fact that there are still debates about whether a "diffusion-capture"-like model is the more dominant mechanism shaping chromatin spatial organization at the TAD scale. Many works have argued that such models could describe TAD-scale chromatin organization, as cited in the revised manuscript (Refs. [11, 14, 15, 17, 20, 22-24, 55]). However, in contrast to the poor description of the Hi-C map by the diffusion-capture model (as demonstrated in the revised manuscript and Supplementary Fig. 15), the excellent experiment-simulation agreement achieved by CCLE provides compelling evidence that cohesin-based loop extrusion is indeed the primary organizer of TAD-scale chromatin.

      Importantly, CCLE provides a theoretical base for how loop extrusion models can be generalized and applied to organisms without known loop extrusion barriers. Our model also highlights that (and provides means to account for) distributed barriers that impede but do not strictly block LEFs could also impact chromatin configurations. This case might be of importance to organisms with CTCF motifs that infrequently coincide with TAD boundaries, for instance, in the case of Drosophila melanogaster. Moreover, CCLE promises theoretical descriptions of the Hi-C maps of other non-vertebrates in the future, extending the quantitative application of the LEF model across the tree of life. This too would be highly significant if successful.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Yuan et al. report on their development of an analytical model ("CCLE") for loop extrusion with genomic-position-dependent speed, with the idea of accounting for barriers to loop extrusion. They write down master equations for the probabilities of cohesin occupancy at each genomic site and obtain approximate steady-state solutions. Probabilities are governed by cohesin translocation, loading, and unloading. Using ChIP-seq data as an experimental measurement of these probabilities, they numerically fit the model parameters, among which are extruder density and processivity. Gillespie simulations with these parameters combined with a 3D Gaussian polymer model were integrated to generate simulated Hi-C maps and cohesin ChIP-seq tracks, which show generally good agreement with the experimental data. The authors argue that their modeling provides evidence that loop extrusion is the primary mechanism of chromatin organization on ~10-100 kb scales in S. pombe and S. cerevisiae.

      Major comments:

      1. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. How is the agreement of CCLE with experiments more demonstrative of loop extrusion than previous modeling? Relatedly, similar best fit values for S. pombe and S. cerevisiae might not point to a mechanistic conclusion (same "underlying mechanism" of loop extrusion), but rather to similar properties for loop-extruding cohesins in the two species. As an alternative, could a model with variable binding probability given by ChIP-seq and an exponential loop-size distribution work equally well? The stated lack of a dependence on extrusion timescale suggests that a static looping model might succeed. If not, why not?
      2. I do not understand how the loop extrusion residence time drops out. As I understand it, Eq 9 converts ChIP-seq to lattice site probability (involving N_{LEF}, which is related to \rho, and \rho_c). Then, Eqs. 3-4 derive site velocities V_n and U_n if we choose rho, L, and \tau, with the latter being the residence time. This parameter is not specified anywhere and is claimed to be unimportant. It may be true that the choice of timescale is arbitrary in this procedure, but can the authors please clarify?
      3. The assumptions in the solution and application of the CCLE model are potentially constraining to a limited number of scenarios. In particular the authors specify that current due to binding/unbinding, A_n - D_n, is small. This assumption could be problematic near loading sites (centromeres, enhancers in higher eukaryotes, etc.) (where current might be dominated by A_n and V_n), unloading sites (D_n and V_{n-1}), or strong boundaries (D_n and V_{n-1}). The latter scenario is particularly concerning because the manuscript seems to be concerned with the presence of unidentified boundaries. This is partially mitigated by the fact that the model seems to work well in the chosen examples, but the authors should discuss the limitations due to their assumptions and/or possible methods to get around these limitations.
      4. Related to the above concern, low cohesin occupancy is interpreted as a fast extrusion region and high cohesin occupancy is interpreted as a slow region. But this might not be true near cohesin loading and unloading sites.
      5. The mechanistic insight attempted in the discussion, specifically with regard to Mis4/Scc2/NIPBL and Pds5, is problematic. First, it is not clear how the discussion of Nipbl and Pds5 is connected to the CCLE method; the justification is that CCLE shows cohesin distribution is linked to cohesin looping, which is already a questionable statement (point 1) and doesn't really explain how the model offers new insight into existing Nipbl and Pds5 data.

      Furthermore, I believe that the conclusions drawn on this point are flawed, or at least, stated with too much confidence. The authors raise the curious point that Nipbl ChIP-seq does not correlate well with cohesin ChIP-seq, and use this as evidence that Nipbl is not a part of the loop-extruding complex in S. pombe, and it is not essential in humans. Aside from the molecular evidence in human Nipbl/cohesin (acknowledged by authors), there are other reasons to doubt this conclusion. First, depletion of Nipbl (rather than binding partner Mau2 as in ref 55) in mouse cells strongly inhibits TAD formation (Schwarzer et al. Nature 551:51 2017). Second, at least two studies have raised concerns about Nipbl ChIP-seq results: 1) Hu et al. Nucleic Acids Res 43:e132 2015, which shows that uncalibrated ChIP-seq can obscure the signal of protein localization throughout the genome due to the inability to distinguish from background and 2) Rhodes et al. eLife 6:e30000, which uses FRAP to show that Nipbl binds and unbinds to cohesin rapidly in human cells, which could go undetected in ChIP-seq, especially when uncalibrated. It has not been shown that these dynamics are present in yeast, but there is no reason to rule it out yet.

      Similar types of critiques could be applied to the discussion of Pds5. There is cross-correlation between Psc3 and Pds5 in S. pombe, but the authors are unable to account for whether Pds5 binding is transient and/or necessary to loop extrusion itself or, more importantly, whether Pds5 ChIP is associated with extrusive or cohesive cohesins; cross-correlation peaks at about 0.6, but note that by the authors' own estimates, cohesive cohesins are approximately half of all cohesins in S. pombe (Table 3).

      Due to the above issues, I suggest that the authors heavily revise this discussion to better reflect the current experimental understanding and the limited ability to draw such conclusions based on the current CCLE model.

      6. I suggest that the authors recalculate correlations for Hi-C maps using maps that are rescaled by the P(s) curves. As currently computed, most of the correlation between maps could arise from the characteristic decay of P(s) rather than smaller scale features of the contact maps. This could reduce the surprising observed correlation between distinct genomic regions in pombe (which, problematically, is higher than the observed correlation between simulation and experiment in cerevisiae).
      7. Please explain why the difference between right and left currents at any particular site, (R_n - L_n)/(R_n + L_n), should be small. It seems easy to imagine scenarios where this might not be true, such as directional barriers like CTCF or transcribed genes.
      8. Optional, but I think it would greatly improve the manuscript: can the authors
         a) analyze regions of high cohesin occupancy (assumed to be slow extrusion regions) to determine if there's anything special in these regions, such as more transcriptional activity
         b) apply this methodology to vertebrate cell data
      9. A GitHub link is provided but the code is not currently available.

      Minor Comments:

      1. Please state the simulated LEF lifetime, since the statement in the methods that 15000 timesteps are needed for equilibration of the LEF model is otherwise not meaningful. Additionally, please note that backbone length is not necessarily a good measure of steady state, since the backbone can be compacted to its steady-state value while the loop distribution continues to evolve toward its steady state.
      2. How important is the cohesive cohesin parameter in the model, e.g., how good are fits with \rho_c = 0?
      3. A nice (but non-essential) supplemental visualization might be to show a scatter of sim cohesin occupancy vs. experiment ChIP.
      4. A similar calculation of Hi-C contacts based on simulated loop extruder positions using the Gaussian chain model was previously presented in Banigan et al. eLife 9:e53558 2020, which should be cited.
      5. It is stated that simulation agreement with experiments for cerevisiae is worse in part due to variability in the experiments, with MPR and Pearson numbers for cerevisiae replicates computed for reference. But these numbers are difficult to interpret without, for example, similar numbers for duplicate pombe experiments. Again, these numbers should be generated using Hi-C maps scaled by P(s), especially in case there are systematic errors in one replicate vs. another.
      6. In the model section, it is stated that LEF binding probabilities are uniformly distributed. Did the authors mean the probability is uniform across the genome or that the probability at each site is a uniformly distributed random number? Please clarify, and if the latter, explain why this unconventional assumption was made.
      7. Supplement p4 line 86 - what is meant by "processivity of loops extruded by isolated LEFs"? "size of loops extruded by..." or "processivity of isolated LEFs"?
      8. The use of parentheticals in the caption to Table 2 is a little confusing; adding a few extra words would help.
      9. Page 12 sentence line 315-318 is difficult to understand. The barrier parameter is apparently something from ref 47 not previously described in the manuscript.
      10. Statement on p14 line 393-4 is false: prior LEF models have not been limited to vertebrates, and the authors have cited some of them here. There are also non-vertebrate examples with extrusion barriers: genes as boundaries to condensin in bacteria (Brandao et al. PNAS 116:20489 2019) and MCM complexes as boundaries to cohesin in yeast (Dequeker et al. Nature 606:197 2022).

      Referees cross-commenting

      I agree with the comments of Reviewer 1, which are interesting and important points that should be addressed.

      Significance

      Analytically approaching extrusion by treating cohesin translocation as a conserved current is an interesting approach to modeling and analysis of extrusion-based chromatin organization. It appears to work well as a descriptive model. But I think there are major questions concerning the mechanistic value of this model, possible applications of the model, the provided interpretations of the model and experiments, and the limitations of the model under the current assumptions. I am unconvinced that this analysis specifically is sufficient to demonstrate that extrusion is the primary organizer of chromatin on these scales; moreover, the need to demonstrate this is questionable, as extrusion is widely accepted, even if not universally so. It is also unclear that the minimal approach of the CCLE necessarily offers an improved physical basis for modeling extrusion, as compared to previous efforts such as ref 47, as claimed by the authors. There are also questions about significance due to possible limitations of the model (detailed above). Applying the CCLE model to identify barriers would be interesting, but is not attempted. Overall, the work presents a reasonable analytical model and numerical method, but until the major comments above are addressed and some reasonable application or mechanistic value or interpretation is presented, the overall significance is somewhat limited.

    1. Latest News Click to read more latest news

      This section makes it easy for a screen reader to help direct the reader, as there are titles and a brief summary of what the article is about. This can help the reader immediately understand the content as it is navigable by a screen reader which many readers may rely on. There are also alt descriptions by examining the HTML code.

    1. Author response:

      eLife assessment 

      This important study provides evidence for a combination of the latest generation of Oxford Nanopore Technology long reads with state-of-the-art variant callers enabling bacterial variant discovery at accuracy that matches or exceeds the current "gold standard" with short reads. The evidence supporting the claims of the authors is convincing, although the inclusion of a larger number of reference genomes would further strengthen the study. The work will be of interest to anyone performing sequencing for outbreak investigations, bacterial epidemiology, or similar studies.

      We thank the editor and reviewers for the accurate summary and positive assessment. We address the comment about increasing the number of reference genomes in the response to reviewer 2.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, freebayes, and longshot, and three deep learning based variant callers: clair3, deep variant, and nano caller), base calling model (fast, hac and sup) and read depth (using both simplex and duplex reads). 

      Strengths: 

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate). 

      Weaknesses: 

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis). 

      We agree that this would be an informative addition to the study and will add it to the benchmarking.

      Appraisal: 

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future. 

      Thank you for the positive appraisal.

      Reviewer #2 (Public Review): 

      Summary: 

      Hall et al describe the superiority of ONT sequencing and deep learning-based variant callers to deliver higher SNP and Indel accuracy compared to previous gold-standard Illumina short-read sequencing. Furthermore, they provide recommendations for read sequencing depth and computational requirements when performing variant calling. 

      Strengths: 

      The study describes compelling data showing ONT superiority when using deep learning-based variant callers, such as Clair3, compared to Illumina sequencing. This challenges the paradigm that Illumina sequencing is the gold standard for variant calling in bacterial genomes. The authors provide evidence that homopolymeric regions, a systematic and problematic issue with ONT data, are no longer a concern in ONT sequencing. 

      Weaknesses: 

      (1) The inclusion of a larger number of reference genomes would have strengthened the study to accommodate larger variability (a limitation mentioned by the authors). 

      Our strategic selection of 14 genomes—spanning a variety of bacterial genera and species, diverse GC content, and both gram-negative and gram-positive species (including M. tuberculosis, which is neither)—was designed to robustly address potential variability in our results. Moreover, all our genome assemblies underwent rigorous manual inspection as the quality of the true genome sequences is the foundation this research is built upon. Given this, the fundamental conclusions regarding the accuracy of variant calls would likely remain unchanged with the addition of more genomes.  However, we do acknowledge that a substantially larger sample size, which is beyond the scope of this study, would enable more fine-grained analysis of species differences in error rates.

      (2) In Figure 2, there are clearly one or two samples that perform worse than others in all combinations (are always below the box plots). No information about species-specific variant calls is provided by the authors but one would like to know if those are recurrently associated with one or two species. Species-specific recommendations could also help the scientific community to choose the best sequencing/variant calling approaches.

      Thank you for highlighting this observation. The precision, recall, and F1 scores for each sample and condition can be found in Supplementary Table S4. We will investigate the samples that consistently perform below expectation to determine if this is associated with specific species, which may necessitate tailored recommendations for those species. Additionally, we will produce a species-segregated version of Figure 2 for a clearer interpretation and will place it in the supplementary materials.
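
      As a reminder of how the three metrics named above relate to one another, here is a minimal sketch using the standard definitions. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration; the study's own evaluation pipeline may handle these cases differently.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from variant-call counts.

    tp: called variants that are in the truth set
    fp: called variants that are not in the truth set
    fn: truth-set variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

      Because F1 is a harmonic mean, a sample with a shortfall in either precision or recall is pulled well below the rest, which is one way a consistently low outlier can arise.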

      (3) The authors support that a read depth of 10x is sufficient to achieve variant calls that match or exceed Illumina sequencing. However, the standard here should be the optimal discriminatory power for clinical and public health utility (namely outbreak analysis). In such scenarios, the highest discriminatory power is always desirable and as such an F1 score, Recall and Precision that is as close to 100% as possible should be maintained (which changes the minimum read sequencing depth to at least 25x, which is the inflection point).

      We agree that the highest discriminatory power is always desirable for clinical or public health applications. In that case, 25x is probably a better minimum recommendation. However, we are also aware that there are resource-limited settings where parity with Illumina is sufficient. In these cases, 10x depth from ONT would provide sufficient data.

      The manuscript currently emphasises the latter scenario, but we will revise the text to clearly recommend 25x depth as a conservative aim in settings where resources are not a constraint, ensuring the highest possible discriminatory power for applications like outbreak analysis.

      (4) The sequencing of the samples was not performed with the same Illumina and ONT method/equipment, which could have introduced specific equipment/preparation artefacts that were not considered in the study. See for example https://academic.oup.com/nargab/article/3/1/lqab019/6193612

      To our knowledge, there is no evidence that sequencing on different ONT machines or barcoding kits leads to a difference in read characteristics or accuracy. To ensure consistency and minimise potential variability, we used the same ONT flowcells for all samples and performed basecalling on the same Nvidia A100 GPU. We will update the methods to emphasise this.

      For Illumina and ONT, the exact machines used for which samples will be added as a supplementary table. We will also add a comment about possible Illumina error rate differences in the ‘Limitations’ section of the Discussion.

      In summary, while there may be specific equipment or preparation artifacts to consider, we took steps to minimise these effects and maintain consistency across our sequencing methods.

      Reviewer #3 (Public Review): 

      Hall et al. benchmarked different variant calling methods on Nanopore reads of bacterial samples and compared the performance of Nanopore to short reads produced with Illumina sequencing. To establish a common ground for comparison, the authors first generated a variant truth set for each sample and then projected this set to the reference sequence of the sample to obtain a mutated reference. Subsequently, Hall et al. called SNPs and small indels using commonly used deep learning and conventional variant callers and compared the precision and accuracy from reads produced with simplex and duplex Nanopore sequencing to Illumina data. The authors did not investigate large structural variation, which is a major limitation of the current manuscript. It will be very interesting to see a follow-up study covering this much more challenging type of variation. 

      We fully agree that investigating structural variations (SVs) would be a very interesting and important follow-up. Identifying and generating ground truth SVs is a nontrivial task and we feel it deserves its own space and study. We hope to explore this in the future.

      In their comprehensive comparison of SNPs and small indels, the authors observed superior performance of deep learning over conventional variant callers when Nanopore reads were basecalled with the most accurate (but also computationally very expensive) model, even exceeding Illumina in some cases. Not surprisingly, Nanopore underperformed compared to Illumina when basecalled with the fastest (but computationally much less demanding) method with the lowest accuracy. The authors then investigated the surprisingly higher performance of Nanopore data in some cases and identified lower recall with Illumina short read data, particularly from repetitive regions and regions with high variant density, as the driver. Combining the most accurate Nanopore basecalling method with a deep learning variant caller resulted in low error rates in homopolymer regions, similar to Illumina data. This is remarkable, as homopolymer regions are (or, were) traditionally challenging for Nanopore sequencing. 

      Lastly, Hall et al. provided useful information on the required Nanopore read depth, which is surprisingly low, and the computational resources for variant calling with deep learning callers. With that, the authors established a new state-of-the-art for Nanopore-only variant calling on bacterial sequencing data. Most likely these findings will be transferred to other organisms as well or at least provide a proof-of-concept that can be built upon.

      As the authors mention multiple times throughout the manuscript, Nanopore can provide sequencing data in nearly real-time and in remote regions, therefore opening up a ton of new possibilities, for example for infectious disease surveillance. 

      However, the high-performing variant calling method as established in this study requires the computationally very expensive sup and/or duplex Nanopore basecalling, whereas the least computationally demanding method underperforms. Here, the manuscript would greatly benefit from extending the last section on computational requirements, as the authors determine the resources for the variant calling but do not cover the entire picture. This could even be misleading for less experienced researchers who want to perform bacterial sequencing at high performance but with low resources. The authors mention it in the discussion but do not make clear enough that the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required. 

      We have provided runtime benchmarks for basecalling in Supplementary Figure S16 and detailed these times in Supplementary Table S7. In addition, we state in the Results section (P10 L228-230) “Though we do note that if the person performing the variant calling has received the raw (pod5) ONT data, basecalling also needs to be accounted for, as depending on how much sequencing was done, this step can also be resource-intensive.”

      Even with super-accuracy basecalling considered, our analysis shows that variant calling remains the most resource-intensive step for Clair3, DeepVariant, FreeBayes, and NanoCaller. Therefore, the statement “the described computational resources are probably largely insufficient to perform the high-accuracy basecalling required”, is incorrect. However, we will endeavour to make the basecalling component and considerations more prominent in the Results and Discussion.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors assess the accuracy of short variant calling (SNPs and indels) in bacterial genomes using Oxford Nanopore reads generated on R10.4 flow cells from a very similar genome (99.5% ANI), examining the impact of variant caller choice (three traditional variant callers: bcftools, freebayes, and longshot, and three deep learning based variant callers: clair3, deep variant, and nano caller), base calling model (fast, hac and sup) and read depth (using both simplex and duplex reads).

      Strengths:

      Given the stated goal (analysis of variant calling for reads drawn from genomes very similar to the reference), the analysis is largely complete and results are compelling. The authors make the code and data used in their analysis available for re-use using current best practices (a computational workflow and data archived in INSDC databases or Zenodo as appropriate).

      Weaknesses:

      While the medaka variant caller is now deprecated for diploid calling, it is still widely used for haploid variant calling and should at least be mentioned (even if the mention is only to explain its exclusion from the analysis).

      Appraisal:

      The experiments the authors engaged in are well structured and the results are convincing. I expect that these results will be incorporated into "best practice" bacterial variant calling workflows in the future.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      I have trialled the package on my lab's data and it works as advertised. It was straightforward to use and did not require any special training. I am confident this is a tool that will be approachable even to users with limited computational experience. The use of artificial data to validate the approach - and to provide clear limits on applicability - is particularly helpful.

      The main limitation of the tool is that it requires the user to manually select regions. This somewhat limits the generalisability and is also more subjective - users can easily choose "nice" regions that better match with their hypothesis, rather than quantifying the data in an unbiased manner. However, given the inherent challenges in quantifying biological data, such problems are not easily circumventable.

      *

      * I have some comments to clarify the manuscript:

      1. A "straightforward installation" is mentioned. Given this is a Method paper, the means of installation should be clearly laid out.*

      This sentence is now modified. In the revised manuscript we now describe how to install the toolset and we give the link to the toolset website if further information is needed. On this website, we provide a full video tutorial and a user manual. The user manual is also provided as supplementary material to the manuscript.

      * It would be helpful if there was an option to generate an output with the regions analysed (i.e., a JPG image with the data and the drawn line(s) on top). There are two reasons for this: i) A major problem with user-driven quantification is accidental double counting of regions (e.g., a user quantifies a part of an image and then later quantifies the same region). ii) Allows other users to independently verify measurements at a later time.*

      We agree that it is helpful to save the analyzed regions. To answer this comment and the other two reviewers' comments pointing at a similar feature, we have now included an automatic saving of the regions of interest. The user will be able to reopen saved regions of interest using a new function we included in the new version of PatternJ.

      * 3. Related to the above point, it is highlighted that each time point would need to be analysed separately (lines 361-362). It seems like it should be relatively straightforward to add a function where the analysis line can be mapped onto the next time point. The user could then adjust slightly for changes in position, but still be starting from near the previous time point. Given how prevalent time-lapse imaging is, this (or something similar) seems like a clear benefit to add to the software.*

      We agree that the analysis of time series images can be a useful addition. We have added the analysis of time-lapse series in the new version of PatternJ. The principles behind the analysis of time-lapse series and an example of such analysis are provided in Figure 1 - figure supplement 3 and Figure 5, with accompanying text lines 140-153 and 360-372. The analysis includes a semi-automated selection of regions of interest, which will make the analysis of such sequences more straightforward than having to draw a selection on each image of the series. The user is required to draw at least two regions of interest in two different frames, and the algorithm will automatically generate regions of interest in frames in which selections were not drawn. The algorithm generates the analysis immediately after selections are drawn by the user, which includes the tracking of the reference channel.
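The semi-automated selection described above (regions drawn in at least two frames, regions generated automatically in the others) can be illustrated with a minimal Python sketch. It assumes plain linear interpolation of line vertices between the drawn frames; the reply does not state which scheme PatternJ uses, and the tracking of the reference channel is not modelled here. The function name `interpolate_rois` and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Line = List[Point]  # vertices of a (segmented) line selection


def interpolate_rois(keyframes: Dict[int, Line], n_frames: int) -> Dict[int, Line]:
    """Generate one line ROI per frame from ROIs drawn on a few key frames.

    Assumption (not from the manuscript): vertices are linearly interpolated
    between the two nearest drawn frames, and every drawn ROI has the same
    number of vertices. Frames outside the drawn range reuse the nearest ROI.
    """
    if len(keyframes) < 2:
        raise ValueError("draw ROIs in at least two different frames")
    drawn = sorted(keyframes)
    rois: Dict[int, Line] = {}
    for frame in range(n_frames):
        if frame <= drawn[0]:
            rois[frame] = list(keyframes[drawn[0]])
        elif frame >= drawn[-1]:
            rois[frame] = list(keyframes[drawn[-1]])
        else:
            before = max(k for k in drawn if k <= frame)
            after = min(k for k in drawn if k >= frame)
            if before == after:
                # The user drew this frame: keep the selection unchanged.
                rois[frame] = list(keyframes[before])
                continue
            t = (frame - before) / (after - before)
            rois[frame] = [
                ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
                for (x0, y0), (x1, y1) in zip(keyframes[before], keyframes[after])
            ]
    return rois


# Two selections drawn on frames 0 and 4 of a 5-frame series.
drawn_rois = {0: [(0.0, 0.0), (10.0, 0.0)], 4: [(4.0, 8.0), (14.0, 8.0)]}
all_rois = interpolate_rois(drawn_rois, n_frames=5)
print(all_rois[2])
```

With this scheme the user only needs to correct frames where the structure moves non-uniformly; each corrected frame becomes a new key frame.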

      * Line 134-135. The level of accuracy of the searching should be clarified here. This is discussed later in the manuscript, but it would be helpful to give readers an idea at this point what level of tolerance the software has to noise and aperiodicity.

      *

      We agree with the reviewer that a clarification of this part of the algorithm will help the user better understand the manuscript. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181). Regarding the tolerance to noise, it is difficult to estimate it a priori from the choice made at the algorithm stage, so we prefer to leave it to the validation part of the manuscript. We hope this solution satisfies the reviewer and future users.

      *

      **Referees cross-commenting**

      I think the other reviewer comments are very pertinent. The authors have a fair bit to do, but they are reasonable requests. So, they should be encouraged to do the revisions fully so that the final software tool is as useful as possible.

      Reviewer #1 (Significance (Required)):

      Developing software tools for quantifying biological data that are approachable for a wide range of users remains a longstanding challenge. This challenge is due to: (1) the inherent problem of variability in biological systems; (2) the complexity of defining clearly quantifiable measurables; and (3) the broad spread of computational skills amongst likely users of such software.

      In this work, Blin et al., develop a simple plugin for ImageJ designed to quickly and easily quantify regular repeating units within biological systems - e.g., muscle fibre structure. They clearly and fairly discuss existing tools, with their pros and cons. The motivation for PatternJ is properly justified (which is sadly not always the case with such software tools).

      Overall, the paper is well written and accessible. The tool has limitations but it is clearly useful and easy to use. Therefore, this work is publishable with only minor corrections.

      *We thank the reviewer for the positive evaluation of PatternJ and for pointing out its accessibility to the users.

      *

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      # Summary

      The authors present an ImageJ Macro GUI tool set for the quantification of one-dimensional repeated patterns that are commonly occurring in microscopy images of muscles.

      # Major comments

      In our view the article and also software could be improved in terms of defining the scope of its applicability and user-ship. In many parts the article and software suggest that general biological patterns can be analysed, but then in other parts very specific muscle actin wordings are used. We are pointing this out in the "Minor comments" sections below. We feel that the authors could improve their work by making a clear choice here. One option would be to clearly limit the scope of the tool to the analysis of actin structures in muscles. In this case we would recommend to also rename the tool, e.g. MusclePatternJ. The other option would be to make the tool about the generic analysis of one-dimensional patterns, maybe calling the tool LinePatternJ. In the latter case we would recommend to remove all actin specific wordings from the macro tool set and also the article should be in parts slightly re-written.

      *

      We agree with the reviewer that our initial manuscript used a mix of general and muscle-oriented vocabulary, which could make the use of PatternJ confusing especially outside of the muscle field. To make PatternJ useful for the largest community, we corrected the manuscript and the PatternJ toolset to provide the general vocabulary needed to make it understandable for every biologist. We modified the manuscript accordingly.

      * # Minor/detailed comments

      # Software

      We recommend considering the following suggestions for improving the software.

      ## File and folder selection dialogs

      In general, clicking on many of the buttons just opens up a file-browser dialog without any further information. For novel users it is not clear what the tool expects one to select here. It would be very good if the software could be rewritten such that there are always clear instructions displayed about which file or folder one should open for the different buttons.*

      We found that, with the current version of macOS, the file-browser dialog does not display any message; we suspect this is the issue raised by the reviewer. This is a known issue with Fiji, and with all applications, on Mac since 2016. The user manual and the tutorial video explain how to correct this issue by changing a parameter in Fiji. Given the difficulties the reviewer had accessing the material on the PatternJ website, for which we apologize, we understand the issue raised. We added an extra warning on the PatternJ website to point out this problem and its solution. Additionally, we have limited the appearance of the file-browser dialog to what we thought was strictly necessary. Thus, the user will experience fewer prompts, speeding up the analysis.

      *

      ## Extract button

      The tool asks one to specify things like whether selections are drawn "M-line-to-M-line"; for users that are not experts in muscle morphology this is not understandable. It would be great to find more generally applicable formulations. *

      We agree that this muscle-oriented vocabulary can make the use of PatternJ confusing. We have now corrected the user interface to provide both general and muscle-specific vocabulary ("center-to-center or edge-to-edge (M-line-to-M-line or Z-disc-to-Z-disc)").*

      ## Manual selection accuracy

      The first step of the analysis is always a user hand-drawn profile across intensity patterns in the image. However, this step can introduce inaccuracy that varies with the shape and curvature of the line profile drawn. If the line is not strictly perpendicular to, for example, the M-line patterns, the distance between intensity peaks will be different. This will be more problematic when dealing with non-straight, parallel features in the image. If the structure is bent, the line drawn over it also needs to reproduce this curve to precisely capture the intensity pattern. I found that this limits the reproducibility and ease of use of the software.*

      We understand the concern of the reviewer. On curved selections this will be an issue that is difficult to solve, especially on "S" curved or more complex selections. The user will have to be very careful in these situations. On non-curved samples, the issue may be concerning at first sight, but the errors go with the inverse of cosine and are therefore rather low. For example, if the user creates a selection off by 5 degrees, which is visually obvious, lengths will be affected by an increase of only 0.38%. The point raised by the reviewer is important to discuss, and we therefore added a paragraph to comment on the choice of selection (lines 94-98) and a supplementary figure to help make it clear (Figure 1 - figure supplement 1).*
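The inverse-cosine argument above can be checked numerically. The short Python sketch below assumes a straight selection tilted by a fixed angle away from the direction along which the pattern repeats, so that a true spacing d is measured as d / cos(angle); the function name is hypothetical.

```python
import math


def length_overestimate_percent(angle_deg: float) -> float:
    """Relative increase (in %) of a measured pattern length when the line
    selection is tilted by angle_deg away from the pattern axis."""
    theta = math.radians(angle_deg)
    return (1.0 / math.cos(theta) - 1.0) * 100.0


for angle in (1, 3, 5, 10):
    print(f"{angle:>2} degrees -> +{length_overestimate_percent(angle):.2f}%")
```

For a 5 degree tilt this evaluates to roughly 0.38%, matching the figure quoted in the reply. The error grows quickly at larger angles, which is why curved or strongly tilted selections still require care.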

      ### Reproducibility

      Since the line profile drawn on the image is the first step and very essential to the entire process, it should be considered to save together with the analysis result. For example, as ImageJ ROI or ROIset files that can be re-imported, correctly positioned, and visualized in the measured images. This would greatly improve the reproducibility of the proposed workflow. In the manuscript, only the extracted features are being saved (because the save button is also just asking for a folder containing images, so I cannot verify its functionality). *

      We agree that this is a very useful and important feature. We have added ROI automatic saving. Additionally, we now provide a simplified import function of all ROIs generated with PatternJ and the automated extraction and analysis of the list of ROIs. This can be done from ROIs generated previously in PatternJ or with ROIs generated from other ImageJ/Fiji algorithms. These new features are described in the manuscript in lines 120-121 and 130-132.

      *

      ## ? button

      It would be great if that button would open up some usage instructions.

      *

      We agree with the reviewer that the "?" button can be used in a better way. We have replaced this button with a Help menu, including a simple tutorial showing a series of images detailing the steps to follow by the user, a link to the user website, and a link to our video tutorial.

      * ## Easy improvement of workflow

      I would suggest a reasonable expansion of the current workflow, by fitting and displaying 2D lines to the band or line structure in the image, that form the "patterns" the author aims to address. Thus, it extracts geometry models from the image, and the inter-line distance, and even the curve formed by these sets of lines can be further analyzed and studied. These fitted 2D lines can be also well integrated into ImageJ as Line ROI, and thus be saved, imported back, and checked or being further modified. I think this can largely increase the usefulness and reproducibility of the software.

      *

      We hope that we understood this comment correctly. We had sent a clarification request to the editor, but unfortunately did not receive an answer within the requested 4 weeks of this revision. We understood the following: instead of using our 1D approach, in which we extract positions from a profile, the reviewer suggests extracting the position of each feature not as a single point, but as a series of coordinates defining its shape. If this is the case, this is a major modification of the tool that is beyond the scope of PatternJ. We believe that keeping our tool simple makes it robust. This is the major strength of PatternJ. Local fitting would not use line averaging, for instance, which would make the tool less reliable.

      * # Manuscript

      We recommend considering the following suggestions for improving the manuscript. Abstract: The abstract suggests that general patterns can be quantified, however the actual tool quantifies specific subtypes of one-dimensional patterns. We recommend adapting the abstract accordingly.

      *

      We modified the abstract to make this point clearer.

      * Line 58: Gray-level co-occurrence matrix (GLCM) based feature extraction and analysis approach is not mentioned nor compared. At least there's a relatively recent study on Sarcomeres structure based on GLCM feature extraction: https://github.com/steinjm/SotaTool with publication: *https://doi.org/10.1002/cpz1.462

      • *

      We thank the reviewer for making us aware of this publication. We cite it now and have added it to our comparison of available approaches.

      * Line 75: "...these simple geometrical features will address most quantitative needs..." We feel that this may be an overstatement, e.g. we can imagine that there should be many relevant two-dimensional patterns in biology?!*

      We have modified this sentence to avoid potential confusion (lines 76-77).

      • *

      • Line 83: "After a straightforward installation by the user, ...". We think it would be convenient to add the installation steps at this place into the manuscript. *

      This sentence is now modified. We now mention how to install the toolset and we provide the link to the toolset website, if further information is needed (lines 86-88). On the website, we provide a full video tutorial and a user manual.

      * Line 87: "Multicolor images will give a graph with one profile per color." The 'Multicolor images' here should be more precisely stated as "multi-channel" images. Multi-color images could be confused with RGB images which will be treated as 8-bit gray value (type conversion first) images by profile plot in ImageJ. *

      We agree with the reviewer that this could create some confusion. We modified "multicolor" to "multi-channel".

      * Line 92: "...such as individual bands, blocks, or sarcomeric actin...". While bands and blocks are generic pattern terms, the biological term "sarcomeric actin" does not seem to fit in this list. Could a more generic wording be found, such as "block with spike"? *

      We agree with the reviewer that "sarcomeric actin" alone will not be clear to all readers. We modified the text to "block with a central band, as often observed in the muscle field for sarcomeric actin" (lines 103-104). The toolset was modified accordingly.

      * Line 95: "the algorithm defines one pattern by having the features of highest intensity in its centre". Could this be rephrased? We did not understand what that exactly means.*

      We agree with the reviewer that this was not clear. We rewrote this paragraph (lines 101-114) and provided a supplementary figure to illustrate these definitions (Figure 1 - figure supplement 2).

      * Line 124 - 147: This part is the only description of the algorithm behind the feature extraction and analysis, but it is not clearly stated. Many details are missing or assumed to be known by the reader. For example, how sub-pixel resolution is achieved is not clear. One can only assume that, by fitting a Gaussian to the band, the center position (peak) can be calculated from a continuous curve rather than from pixels. *

      Note that the two sentences introducing this description are "Automated feature extraction is the core of the tool. The algorithm takes multiple steps to achieve this (Fig. S2):". We were hoping this statement was clear, but the reviewer may refer to something else. We agree that the description of some of the details of the steps was too quick. We have now expanded the description where needed.
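To make the reviewer's point about sub-pixel resolution concrete, here is a minimal Python sketch of one common approach: a three-point Gaussian (log-parabolic) interpolation around the brightest pixel. This is an illustration of the general idea only. The reply does not specify the fitting procedure used in PatternJ, so the method and the function name `subpixel_peak` are assumptions.

```python
import math
from typing import Sequence


def subpixel_peak(profile: Sequence[float]) -> float:
    """Estimate the position of the brightest band with sub-pixel precision.

    Fits a parabola to the logarithm of the three samples around the maximum,
    which is exact for a noise-free Gaussian band. Falls back to the integer
    pixel position when the fit is not possible.
    """
    i = max(range(len(profile)), key=lambda k: profile[k])
    if i == 0 or i == len(profile) - 1:
        return float(i)  # maximum on the edge: no neighbours on both sides
    left, centre, right = profile[i - 1], profile[i], profile[i + 1]
    if min(left, centre, right) <= 0:
        return float(i)  # logarithm undefined for non-positive intensities
    l_left, l_centre, l_right = math.log(left), math.log(centre), math.log(right)
    curvature = l_left - 2.0 * l_centre + l_right
    if curvature == 0:
        return float(i)  # flat top: no unique vertex
    return i + 0.5 * (l_left - l_right) / curvature


# A Gaussian band centred between pixels 10 and 11.
true_centre, sigma = 10.3, 2.0
band = [math.exp(-((x - true_centre) ** 2) / (2 * sigma ** 2)) for x in range(21)]
print(subpixel_peak(band))
```

On real data, noise limits the achievable precision, which is what the validation on computer-generated images quantifies.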

      * Line 407: We think the availability of both the tool and the code could be improved. For Fiji tools it is common practice to create an Update Site and to make the code available on GitHub. In addition, downloading the example file (https://drive.google.com/file/d/1eMazyQJlisWPwmozvyb8VPVbfAgaH7Hz/view?usp=drive_link) required a Google login and access request, which is not very convenient; in fact, we asked for access but it was denied. It would be important for the download to be easier, e.g. from GitHub or Zenodo.

      *

      We apologize for the issues encountered when downloading the tool and additional material, and we thank the reviewer for pointing out these issues, which limited the accessibility of our tool. We simplified the downloading procedure on the website, which no longer goes through the Google Drive interface or requires a Google account. Additionally, for the coder community, the code, user manual and examples are now available from GitHub at github.com/PierreMangeol/PatternJ, and are provided as supplementary material with the manuscript. To our knowledge, update sites work for plugins but not for macro toolsets. In our experience of sharing code with non-specialists, a classical website with a tutorial video is more accessible than more coder-oriented websites, which deter many users.

      * Reviewer #2 (Significance (Required)):

      The strength of this study is that a tool for the analysis of one-dimensional repeated patterns occurring in muscle fibres is made available in the accessible open-source platform ImageJ/Fiji. In the introduction to the article the authors provide an extensive review of comparable existing tools. Their new tool fills a gap in terms of providing an easy-to-use software for users without computational skills that enables the analysis of muscle sarcomere patterns. We feel that if the below mentioned limitations could be addressed the tool could indeed be valuable to life scientists interested in muscle patterning without computational skills.

      In our view there are a few limitations, including the accessibility of example data and tutorials at sites.google.com/view/patternj, which we had trouble accessing. In addition, we think that the workflow in Fiji, which currently requires pressing several buttons in the correct order, could be further simplified and streamlined by adopting a "wizard" approach, where the user is guided through the steps.

      *As answered above, the links on the PatternJ website are now corrected. Regarding the workflow, we now provide a Help menu with:

      1. a basic set of instructions to use the tool,
      2. a direct link to the tutorial video in the PatternJ toolset
      3. a direct link to the website on which both the tutorial video and a detailed user manual can be found. We hope this addresses the issues raised by this reviewer.

      *Another limitation is the reproducibility of the analysis; here we recommend enabling IJ Macro recording as well as saving of the drawn line ROIs. For more detailed suggestions for improvements please see the above sections of our review. *

      We agree that saving ROIs is very useful. It is now implemented in PatternJ.

      We are not sure what this reviewer means by "enabling IJ Macro recording". The ImageJ Macro Recorder is indeed very useful, but to our knowledge, it is limited to built-in functions. Our code is open and we hope this will be sufficient for advanced users to modify the code and make it fit their needs.*

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary In this manuscript, the authors present a new toolset for the analysis of repetitive patterns in biological images named PatternJ. One of the main advantages of this new tool over existing ones is that it is simple to install and run and does not require any coding skills whatsoever, since it runs on the ImageJ GUI. Another advantage is that it does not only provide the mean length of the pattern unit but also the subpixel localization of each unit and the distributions of lengths and that it does not require GPU processing to run, unlike other existing tools. The major disadvantage of the PatternJ is that it requires heavy, although very simple, user input in both the selection of the region to be analyzed and in the analysis steps. Another limitation is that, at least in its current version, PatternJ is not suitable for time-lapse imaging. The authors clearly explain the algorithm used by the tool to find the localization of pattern features and they thoroughly test the limits of their tool in conditions of varying SNR, periodicity and band intensity. Finally, they also show the performance of PatternJ across several biological models such as different kinds of muscle cells, neurons and fish embryonic somites, as well as different imaging modalities such as brightfield, fluorescence confocal microscopy, STORM and even electron microscopy.

      This manuscript is clearly written, and both the sections and the figures are well organized and tell a cohesive story. By testing PatternJ, I can attest to its ease of installation and use. Overall, I consider that PatternJ is a useful tool for the analysis of patterned microscopy images and this article is fit for publication. However, I do have some minor suggestions and questions that I would like the authors to address, as I consider they could improve this manuscript and the tool:

      *We are grateful to this reviewer for this very positive assessment of PatternJ and of our manuscript.

      * Minor Suggestions: The methodology section is missing a more detailed description of how the plotted metric was obtained: as normalized intensity or as precision in pixels. *

      We agree with the reviewer that a more detailed description of the metric plotted was missing. We added this information in the method part and added information in the Figure captions where more details could help to clarify the value displayed.

      * The validation is based mostly on the SNR and patterns. They should include a dataset of real data to validate the algorithm in three of the standard patterns tested. *

      We validated our tool using computer-generated images, in which we know with certainty the localization of patterns. This allowed us to automatically analyze 30 000 images, and with varying settings, we sometimes analyzed 10 times the same image, leading to about 150 000 selections analyzed. From these analyses, we can provide with confidence an unbiased assessment of the tool precision and the tool capacity to extract patterns. We already provided examples of various biological data images in Figures 4-6, showing all possible features that can be extracted with PatternJ. In these examples, we can claim by eye that PatternJ extracts patterns efficiently, but we cannot know how precise these extractions are because of the nature of biological data: "real" positions of features are unknown in biological data. Such validation will be limited to assessing whether a pattern was found or not, which we believe we already provided with the examples in Figures 4-6.
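The logic of validating on computer-generated images, where the true feature positions are known, can be sketched in a few lines of Python. The example below builds a one-dimensional profile of evenly spaced Gaussian bands with additive Gaussian noise. The band shape, the noise model and the definition of SNR as band amplitude divided by noise standard deviation are assumptions made for illustration, not the settings used in the manuscript.

```python
import math
import random
from typing import List, Tuple


def synthetic_profile(
    n_pixels: int, period: float, band_sigma: float, snr: float, seed: int = 0
) -> Tuple[List[float], List[float]]:
    """Return (noisy profile, true band positions) for a periodic pattern.

    Bands have unit amplitude; noise standard deviation is 1 / snr
    (an assumed SNR definition).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    positions = []
    x = period / 2.0
    while x < n_pixels:
        positions.append(x)
        x += period
    noise_sigma = 1.0 / snr
    profile = []
    for px in range(n_pixels):
        signal = sum(
            math.exp(-((px - p) ** 2) / (2.0 * band_sigma ** 2)) for p in positions
        )
        profile.append(signal + rng.gauss(0.0, noise_sigma))
    return profile, positions


profile, truth = synthetic_profile(n_pixels=100, period=20.0, band_sigma=2.0, snr=10.0)
print(truth)
```

Because `truth` is known exactly, any extraction algorithm run on `profile` can be scored by the distance between the extracted and true positions, which is not possible with biological images.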

      * The video tutorial available in the PatternJ website is very useful, maybe it would be worth it to include it as supplemental material for this manuscript, if the journal allows it. *

      As the video tutorial may have been missed by other reviewers, we agree it is important to make it more prominent to users. We have now added a Help menu in the toolset that opens the tutorial video. Having the video as supplementary material could indeed be a useful addition if the size of the video is compatible with the journal limits.

      * An example image is provided to test the macro. However, it would be useful to provide further example images for each of the three possible standard patterns suggested: Block, actin sarcomere or individual band.*

      We agree this can help users. We now provide another multi-channel example image on the PatternJ website including blocks and a pattern made of a linear intensity gradient that can be extracted with our simpler "single pattern" algorithm, which were missing in the first example. Additionally, we provide an example to be used with our new time-lapse analysis.

      * Access to both the manual and the sample images in the PatternJ website should be made publicly available. Right now they both sit in a private Drive account. *

      As mentioned above, we apologize for access issues that occurred during the review process. These files can now be downloaded directly on the website without any sort of authentication. Additionally, these files are now also available on GitHub.

      * Some common errors are not properly handled by the macro and could be confusing for the user: When there is no selection and one tries to run a Check or Extraction: "Selection required in line 307 (called from line 14). profile=getProfile( ;". A simple "a line selection is required" message would be useful there. When "band" or "block" is selected for a channel in the "Set parameters" window, yet a 0 value is entered into the corresponding "Number of bands or blocks" section, one gets this error when trying to Extract: "Empty array in line 842 (called from line 113). if ( ( subloc . length == 1 ) & ( subloc [ 0 == 0) ) {". This error is not too rare, since the "Number of bands or blocks" section is populated with a 0 after choosing "sarcomeric actin" (after accepting the settings) and stays that way when one changes back to "blocks" or "bands".*

      We thank the reviewer for pointing out these bugs. These bugs are now corrected in the revised version.

      * The fact that every time one clicks on the most used buttons, the getDirectory window appears is not only quite annoying but also, ultimately a waste of time. Isn't it possible to choose the directory in which to store the files only once, from the "Set parameters" window?*

      We have now found a solution to avoid this step. The user is only prompted to provide the image folder when pressing the "Set parameters" button. We kept the directory prompt only for when the user selects the time-lapse analysis or the analysis of multiple ROIs. The main reason is that it is very easy for the analysis to end up in the wrong folder otherwise.

      * The authors state that the outputs of the workflow are "user friendly text files". However, some of them lack descriptive headers (like the localisations and profiles) or even file names (like colors.txt). If there is something lacking in the manuscript, it is a brief description of all the output files generated during the workflow.*

      PatternJ generates multiple files, several of which are internal to the toolset. They are needed to keep track of which analyses were done and which colors were used in the images, amongst others. From the user's perspective, only the files obtained after the analysis, All_localizations.channel_X.txt and sarcomere_lengths.txt, are useful. To improve the user experience, we have now moved all internal files to a folder named "internal", which we think will clarify which outputs are useful for further analysis and which ones are not. We thank the reviewer for raising this point and we now mention it in our Tutorial.

      I don't really see the point in saving the localizations from the "Extraction" step, they are even named "temp".

      We thank the reviewer for this comment, this was indeed not necessary. We modified PatternJ to delete these files after they are used.

      * In the same line, I DO see the point of saving the profiles and localizations from the "Extract & Save" step, but I think they should be deleted during the "Analysis" step, since all their information is then grouped in a single file, with descriptive headers. This deleting could be optional and set in the "Set parameters" window.*

      We understand the point raised by the reviewer. However, the analysis depends on the reference channel picked, which is asked for when starting an analysis, and can be augmented with additional selections. If a user chooses to modify the reference channel or to add a new profile to the analysis, deleting all these files would mean that the user will have to start over again, which we believe will create frustration. An optional deletion at the analysis step is simple to implement, but it could create problems for users who do not understand what it means practically.

      * Moreover, I think it would be useful to also save the linear roi used for the "Extract & Save" step, and eventually combine them during the "Analysis step" into a single roi set file so that future re-analysis could be made on the same regions. This could be an optional feature set from the "Set parameters" window. *

      We agree with the reviewer that saving ROIs is very useful. ROIs are now saved into a single file each time the user extracts and saves positions from a selection. Additionally, the user can re-use previous ROIs and analyze an image or image series in a single step.

      * In the "PatternJ workflow" section of the manuscript, the authors state that after the "Extract & Save" step "(...) steps 1, 2, 4, and 5 can be repeated on other selections (...)". However, technically, only steps 1 and 5 are really necessary (alternatively 1, 4 and 5 if the user is unsure of the quality of the patterning). If a user follows this to the letter, I think it can lead to wasted time.

      *

      We agree with the reviewer and have corrected the manuscript accordingly (lines 119-120).

      • *

      *I believe that the "Version Information" button, although important, has potential to be more useful if used as a "Help" button for the toolset. There could be links to useful sources like the manuscript or the PatternJ website but also some tips like "whenever possible, use a higher linewidth for your line selection" *

      We agree with the reviewer as pointed out in our previous answers to the other reviewers. This button is now replaced by a Help menu, including a simple tutorial in a series of images detailing the steps to follow, a link to the user website, and a link to our video tutorial.

      * It would be interesting to mention to what extent does the orientation of the line selection in relation to the patterned structure (i.e. perfectly parallel vs more diagonal) affect pattern length variability?*

      As answered to reviewer 2, we understand this concern, which needs to be clarified for readers. The issue may be concerning at first sight, but the errors grow only with the inverse of cosine and are therefore rather low. For example, if the user creates a selection off by 3 degrees, which is visually obvious, lengths will be affected by an increase of only 0.14%. The point raised by the reviewer is important to discuss, and we therefore have added a comment on the choice of selection (lines 94-98) as well as a supplementary figure (Figure 1 - figure supplement 1).
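The scale of this error follows from a small-angle expansion. The derivation below is a sketch under the assumption of a straight selection tilted by an angle θ (in radians) from the pattern axis; the symbols L_measured and L_true are introduced here for illustration.

```latex
\frac{L_{\mathrm{measured}}}{L_{\mathrm{true}}}
  = \frac{1}{\cos\theta}
  \approx 1 + \frac{\theta^{2}}{2}
  \qquad (\theta \ll 1)

\theta = 3^{\circ} \approx 0.0524\ \mathrm{rad}
  \;\Rightarrow\;
  \frac{\theta^{2}}{2} \approx 1.4 \times 10^{-3} \approx 0.14\,\%
```

The error is quadratic in the tilt, so halving the misalignment reduces the length error by about a factor of four.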

      * When "the algorithm uses the peak of highest intensity as a starting point and then searches for peak intensity values one spatial period away on each side of this starting point" (line 133-135), does that search have a range? If so, what is the range? *

      We agree that this information is useful to share with the reader. The range is one pattern size. We have modified the sentence to clarify the range of search used and the resulting limits in aperiodicity (now lines 176-181).
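The search strategy quoted by the reviewer (start from the highest peak, then look one spatial period away on each side) can be sketched in Python as follows. The sketch assumes that "a range of one pattern size" means a window of total width one period centred on the expected position; this interpretation, the integer-pixel period and the function name `find_periodic_peaks` are assumptions, and the actual PatternJ implementation may differ.

```python
from typing import List, Sequence


def find_periodic_peaks(profile: Sequence[float], period: int) -> List[int]:
    """Locate peaks spaced roughly one period apart along a profile.

    Starts at the global maximum, then repeatedly jumps one period to the
    left and to the right, taking the brightest pixel within half a period
    of each expected position (assumed search window).
    """
    if period < 2:
        raise ValueError("period must be at least 2 pixels")
    n = len(profile)
    if n == 0:
        return []
    half = period // 2
    start = max(range(n), key=lambda i: profile[i])
    peaks = [start]
    for direction in (-1, 1):
        current = start
        while True:
            expected = current + direction * period
            if expected < 0 or expected >= n:
                break  # walked off the end of the selection
            lo = max(0, expected - half)
            hi = min(n - 1, expected + half)
            # The next search is centred on the peak actually found, so small
            # deviations from perfect periodicity do not accumulate.
            current = max(range(lo, hi + 1), key=lambda i: profile[i])
            peaks.append(current)
    return sorted(peaks)


# Slightly aperiodic pattern: nominal period 10, one peak shifted by 1 pixel.
profile = [0.0] * 40
for position, height in ((5, 1.0), (15, 2.0), (26, 1.5), (35, 1.2)):
    profile[position] = height
print(find_periodic_peaks(profile, period=10))
```

With a window of this width, a peak displaced by more than half a period from its expected position would be missed or misassigned, which is one way to read the aperiodicity limit mentioned in the reply.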

      * Line 144 states that the parameters of the fit are saved and given to the user, yet I could not find such information in the outputs. *

      The parameters of the fits are saved for blocks. We have now clarified this point by modifying the manuscript (lines 186-198) and Figure 1 - figure supplement 5. We also realized that we had made an error in the description of how the edges of "block with middle band" are extracted. This is now corrected.

      * In line 286, the authors finish by saying "More complex patterns from electron microscopy images may also be used with PatternJ." Since this statement is not backed by evidence in the manuscript, I suggest deleting it (or at the very least, providing some examples of what more complex patterns the authors refer to). *

      This sentence is now deleted.

      * In the TEM image of the fly wing muscle in fig. 4 there is a subtle but clearly visible white stripe pattern in the original image. Since that pattern consists of 'dips', rather than 'peaks' in the profile of the inverted image, they do not get analyzed. I think it is worth mentioning that if the image of interest contains both "bright" and "dark" patterns, then the analysis should be performed in both the original and the inverted images because the nature of the algorithm does not allow it to detect "dark" patterns. *

      We agree with the reviewer's comment. We now mention this point in lines 337-339.

      * In line 283, the authors mention using background correction. They should state explicitly which method of background correction they used. If they used ImageJ's "Subtract Background" tool, then they should specify the radius.*

      We now describe this step in the Methods section.

      *

      Reviewer #3 (Significance (Required)):

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. Being a software paper, the advance proposed by the authors is technical in nature. The novelty and significance of this tool is that it offers quick and simple pattern analysis at the single unit level to a broad audience, since it runs on the ImageJ GUI and does not require any programming knowledge. Moreover, all the modules and steps are well described in the paper, which makes the analysis easy to follow.
      • Place the work in the context of the existing literature (provide references, where appropriate). The authors themselves provide a good and thorough comparison of their tool with other existing ones, both in terms of ease of use and on the type of information extracted by each method. While PatternJ is not necessarily superior in all aspects, it succeeds at providing precise single pattern unit measurements in a user-friendly manner.
      • State what audience might be interested in and influenced by the reported findings. Most researchers who work with microscopy images of muscle cells, fibers, or any other patterned sample, and who are interested in analyzing changes in that pattern in response to perturbations, time, development, etc., could use this tool to obtain useful information that would otherwise be laborious to extract. *

      We thank the reviewer for these enthusiastic comments about how straightforward it is for biologists to use PatternJ and about its broad applicability in the bio community.