Dataset related to "Efficiency and reliability in biological neural network architectures"

  • Daniela Egas Santander (Creator)
  • Christoph Pokorny (Creator)
  • András Ecker (Creator)
  • Jānis Lazovskis (Creator)
  • Matteo Santoro (Creator)
  • Jason P. Smith (Nottingham Trent University) (Creator)
  • Kathryn Hess (Creator)
  • Ran Levi (Creator)
  • Michael W. Reimann (Creator)

Dataset

Description

This dataset accompanies the article "Efficiency and reliability in biological neural network architectures" (DOI: 10.1101/2024.03.15.585196). It contains structural and activity data related to the morphologically detailed model of the rat somatosensory cortex (Markram et al., 2015), referred to as "BBP" data in the article. Specifically, the following data items are included:

Simulation data: simulation.xz

"Reliability" protocol: Separate folders BlobStimReliability_O1v5-SONATA_ with simulation data using the baseline and all manipulated connectomes, respectively (see Technical info below), each containing:

  • working_dir/connectome.h5: Connectivity matrix in ConnectivityMatrix format, which can be loaded using ConnectomeUtilities.
  • working_dir/raw_spikes_exc_.npy: Raw (excitatory) spikes in numpy .npy format, containing an array of spike times (first column) and corresponding neuron GIDs (second column). One file for each of the 10 simulations with different simulator seeds (0..9).
  • working_dir/stim_stream.npy: Stimulus train in numpy .npy format, containing the sequence of stimulus identities.
  • working_dir/time_windows.npy: Time windows in ms in numpy .npy format, corresponding to the stimulus train.
  • working_dir/processed_data_store.h5: Data store in HDF5 format with preprocessed spike signals (e.g., required for Gaussian kernel reliability computations), which contains:
      • spike_signals_exc: Group of simulation datasets "sim_0" to "sim_9", each of which is an array containing binned spike signals filtered with a Gaussian kernel.
      • sigma: Sigma in ms of the Gaussian kernel used for smoothing.
      • gids: List of excitatory neuron GIDs.
      • t_bins: List of time bins in ms.
      • firing_rates: Average firing rates per simulation (0..9; rows) and (excitatory) neuron GID (columns); average firing rates were computed as the inverse of the mean inter-spike interval per neuron.
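The raw spike layout and the firing-rate definition described above (inverse of the mean inter-spike interval per neuron) can be illustrated with a minimal sketch. It is not the authors' code: the function name firing_rates_from_spikes, the file name raw_spikes_example.npy, and the synthetic spike array are hypothetical, spike times are assumed to be in ms, and returning NaN for neurons with fewer than two spikes is a choice made here for illustration only.

```python
import os
import tempfile

import numpy as np


def firing_rates_from_spikes(spikes, gids):
    """Rate per GID as the inverse of the mean inter-spike interval.

    spikes: array of shape (n_spikes, 2); column 0 holds spike times
            (assumed to be in ms), column 1 the corresponding neuron GIDs.
    Returns rates in Hz; NaN where a neuron has fewer than two spikes
    (an assumption of this sketch, not documented for the dataset).
    """
    times, spike_gids = spikes[:, 0], spikes[:, 1]
    rates = np.full(len(gids), np.nan)
    for i, gid in enumerate(gids):
        t = np.sort(times[spike_gids == gid])
        if t.size >= 2:
            # mean ISI is in ms, so 1000 / mean ISI gives spikes per second
            rates[i] = 1000.0 / np.mean(np.diff(t))
    return rates


# Synthetic stand-in for one raw spikes file (hypothetical values).
demo_spikes = np.array(
    [[10.0, 1], [30.0, 1], [50.0, 1], [5.0, 2], [105.0, 2], [40.0, 3]]
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "raw_spikes_example.npy")  # hypothetical file name
    np.save(path, demo_spikes)
    spikes = np.load(path)  # same call would read a real .npy spikes file

gids = np.unique(spikes[:, 1]).astype(int)
rates = firing_rates_from_spikes(spikes, gids)
print(dict(zip(gids.tolist(), rates.tolist())))
```

With real data, the np.load call would point at one of the raw spikes files in working_dir, and the result could be compared against the corresponding row of firing_rates in the data store.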
"Classification" protocol: Single folder Toposample_O1v5-SONATA with simulation data using the baseline connectome, stored in a format compatible with the TriDy (Conceição et al., 2022) and TopoSampling (Reimann et al., 2022) pipelines, containing: toposample_input/connectivity.npz: Sparse connectivity matrix in Compressed Sparse Column format , which can be loaded using scipy.sparse.load_npz. toposample_input/neuron_info.pickle: Pandas dataframe in pickle format, which can be loaded using pandas.read_pickle, containing additional information about each neuron. toposample_input/raw_spikes_exc.npy: Raw (excitatory) spikes in numpy .npy format, containing an array of spike times (first column) and corresponding neuron GIDs (second column). toposample_input/stim_stream.npy: Stimulus train in numpy .npy format, containing the sequence of stimulus identities. toposample_input/time_windows.npy: Time windows in ms in numpy .npy format, corresponding to the stimulus train. Classification data: classification.xz Selected neighborhoods and classification results for the "PCA" method (TopoSampling pipeline) as well as the "network_based" method using active subnetworks (TriDy pipeline, using TriDy-tools wrapper), stored as: "PCA" method: PCA/community_database_PCA.pkl: Pandas dataframe in pickle format, containing the binary selection of 50 neighborhood centers (neuron GIDs; rows) for the different selection parameters (columns). PCA/classification_results_PCA.pkl: Pandas dataframe in pickle format, containing the classification accuracies for all selection parameters (rows) and 6 cross-validation folds plus mean (columns). "network_based" method: network_based/selections_reliability.pkl: Pandas dataframe in pickle format, containing different combinations of first/second selection parameters for the double selection procedure (rows) together with neuron indices (w.r.t. the EXC subcircuit) of the corresponding 50 neighborhood centers (chief0..49; columns). 
  • network_based/partition_reliability.npy: Partition indices in numpy .npy format required to launch the pipeline using TriDy-tools, which is an array of 50 neuron indices (w.r.t. the full circuit!) of the neighborhood centers (columns) for each combination of selections as in selections_reliability.pkl (rows).
  • network_based/results/...: Subfolder containing a list of pickle files with the classification results using different featurization parameters, as indicated by the filename. Each file contains a Pandas dataframe with the classification accuracies and numbers of (non-zero) features (columns) for each combination of selections as in selections_reliability.pkl (rows).

Funding

Funding provided by the Swiss government’s ETH Board to the Blue Brain Project, a research center of the École polytechnique fédérale de Lausanne (EPFL).
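As a rough illustration of how the classification inputs listed in the description (connectivity.npz and neuron_info.pickle) might be read and combined, the sketch below round-trips a tiny synthetic circuit through scipy.sparse.load_npz and pandas.read_pickle and extracts one neighborhood. Several things here are assumptions rather than facts about the dataset: the helper neighborhood is hypothetical, a neighborhood is taken to mean a center plus all neurons directly connected to it in either direction, the "layer" column is invented, and the rows of neuron_info are assumed to align with the matrix indices.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from scipy import sparse


def neighborhood(adj, center):
    """Sorted indices of `center` and all neurons directly connected to it.

    Connections in both directions are included, so the result does not
    depend on whether rows or columns denote the presynaptic side.
    """
    in_row = adj[[center], :].nonzero()[1]  # nonzero entries in the center's row
    in_col = adj[:, [center]].nonzero()[0]  # nonzero entries in the center's column
    return np.union1d(np.union1d(in_row, in_col), [center])


# Tiny synthetic circuit with connections 0->1, 2->0 and 1->3 (hypothetical).
rows = np.array([0, 2, 1])
cols = np.array([1, 0, 3])
adj = sparse.csc_matrix((np.ones(3, dtype=np.int8), (rows, cols)), shape=(4, 4))
info = pd.DataFrame({"layer": [2, 4, 5, 6]})  # invented column for illustration

with tempfile.TemporaryDirectory() as tmp:
    conn_path = os.path.join(tmp, "connectivity.npz")
    info_path = os.path.join(tmp, "neuron_info.pickle")
    sparse.save_npz(conn_path, adj)
    info.to_pickle(info_path)
    # The same two calls would read the real files from toposample_input/.
    loaded = sparse.load_npz(conn_path)
    neuron_info = pd.read_pickle(info_path)

nbh = neighborhood(loaded, 0)
print(nbh)
print(neuron_info.iloc[nbh])
```

For the real files, the index bases differ between items (GIDs, EXC subcircuit indices, full-circuit indices, as noted in the description), so any center index should be converted to the matrix's indexing before a lookup like this.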
Date made available: 10 Apr 2024
Publisher: Zenodo
