### Recent Submissions

• Journal Article

#### A graph-based algorithm for detecting rigid domains in protein structures

BMC Bioinformatics. 2021 Feb 12;22(1):66
Background Conformational transitions are implicated in the biological function of many proteins. Structural changes in proteins can be described approximately as the relative movement of rigid domains against each other. Despite previous efforts, there is a need to develop new domain segmentation algorithms that are capable of analysing the entire structure database efficiently and do not require the choice of protein-dependent tuning parameters such as the number of rigid domains. Results We develop a graph-based method for detecting rigid domains in proteins. Structural information from multiple conformational states is represented by a graph whose nodes correspond to amino acids. Graph clustering algorithms allow us to reduce the graph and run the Viterbi algorithm on the associated line graph to obtain a segmentation of the input structures into rigid domains. In contrast to many alternative methods, our approach does not require knowledge about the number of rigid domains. Moreover, we identified default values for the algorithmic parameters that are suitable for a large number of conformational ensembles. We test our algorithm on examples from the DynDom database and illustrate our method on various challenging systems whose structural transitions have been studied extensively. Conclusions The results strongly suggest that our graph-based algorithm forms a novel framework to characterize structural transitions in proteins via detecting their rigid domains. The web server is available at http://azifi.tz.agrar.uni-goettingen.de/webservice/ .
• Journal Article

#### Bayesian spatial modelling of childhood cancer incidence in Switzerland using exact point data: a nationwide study during 1985–2015

International Journal of Health Geographics. 2020 Apr 17;19(1):15
Background The aetiology of most childhood cancers is largely unknown. Spatially varying environmental factors such as traffic-related air pollution, background radiation and agricultural pesticides might contribute to the development of childhood cancer. This study is the first investigation of the spatial disease mapping of childhood cancers using exact geocodes of place of residence. Methods We included 5947 children diagnosed with cancer in Switzerland during 1985–2015 at 0–15 years of age from the Swiss Childhood Cancer Registry. We modelled cancer risk using log-Gaussian Cox processes and indirect standardisation to adjust for age and year of diagnosis. We examined whether the spatial variation of risk can be explained by modelled ambient air concentration of NO2, modelled exposure to background ionising radiation, area-based socio-economic position (SEP), linguistic region, duration in years of general cancer registration in the canton or degree of urbanisation. Results For all childhood cancers combined, the posterior median relative risk (RR), compared to the national level, varied by location from 0.83 to 1.13 (min to max). Corresponding ranges were 0.96 to 1.09 for leukaemia, 0.90 to 1.13 for lymphoma, and 0.82 to 1.23 for central nervous system (CNS) tumours. The covariates considered explained 72% of the observed spatial variation for all cancers, 81% for leukaemia, 82% for lymphoma and 64% for CNS tumours. There was weak evidence of an association of CNS tumour incidence with modelled exposure to background ionising radiation (RR per SD difference 1.17; 0.98–1.40) and with SEP (1.06; 1.00–1.13). Conclusion Of the investigated diagnostic groups, childhood CNS tumours showed the largest spatial variation. The selected covariates only partially explained the observed variation of CNS tumours, suggesting that other environmental factors also play a role.
• Journal Article

#### Analyzing Dynamic Change in Social Network Based on Distribution-Free Multivariate Process Control Method

Computers, Materials & Continua 2019; 60(3) p.1123-1139
Social organizations can be represented by social networks, which mathematically quantify and represent complex, interrelated organizational behavior. Exploring change in a dynamic social network is essential for situation awareness of the corresponding social organization. Social networks usually evolve gradually and slightly, which makes changes hard to notice. Statistical process control techniques from industry have been used to detect statistically significant changes in social networks, but the original method is restricted by limitations on the measures it can use. This paper presents a generic framework to address the change detection problem in dynamic social networks and introduces distribution-free multivariate control charts to monitor changes in a social network. Three groups of network parameters are integrated to achieve a comprehensive view of the dynamic tendency. The proposed approaches handle non-Gaussian data by categorizing and ranking. Experiments indicate that the nonparametric multivariate procedure is promising for social network analysis.
• Journal Article

#### Asymptotic Distribution and Simultaneous Confidence Bands for Ratios of Quantile Functions

Electronic Journal of Statistics 2019; 13(2) p.4391-4415
Ratios of medians or other suitable quantiles of two distributions are widely used in medical research to compare treatment and control groups or in economics to compare various economic variables when repeated cross-sectional data are available. Inspired by the so-called growth incidence curves introduced in poverty research, we argue that the ratio of quantile functions is a more appropriate and informative tool to compare two distributions. We present an estimator for the ratio of quantile functions and develop corresponding simultaneous confidence bands, which allow one to assess the significance of certain features of the quantile functions ratio. The derived simultaneous confidence bands rely on the asymptotic distribution of the quantile functions ratio and do not require re-sampling techniques. The performance of the simultaneous confidence bands is demonstrated in simulations. Analysis of expenditure data from Uganda in years 1999, 2002 and 2005 illustrates the relevance of our approach.
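To make the central object concrete, the following is a minimal sketch of a pointwise plug-in estimate of a ratio of quantile functions from two samples. It is not the estimator or the confidence-band construction of the paper; the function name `quantile_ratio`, the use of NumPy's default interpolated empirical quantiles, and the simulated log-normal samples are all assumptions made for illustration.

```python
import numpy as np


def quantile_ratio(sample_a, sample_b, probs):
    """Pointwise ratio Q_a(p) / Q_b(p) of empirical quantile functions.

    Hypothetical helper: uses NumPy's default (linearly interpolated)
    empirical quantiles and computes no confidence bands.
    """
    q_a = np.quantile(np.asarray(sample_a, dtype=float), probs)
    q_b = np.quantile(np.asarray(sample_b, dtype=float), probs)
    return q_a / q_b


# Simulated positive "expenditure-like" data for two survey years (made up).
rng = np.random.default_rng(0)
baseline = rng.lognormal(mean=0.0, sigma=0.5, size=2000)
followup = 1.2 * rng.lognormal(mean=0.0, sigma=0.5, size=2000)

probs = np.linspace(0.1, 0.9, 9)
ratios = quantile_ratio(followup, baseline, probs)
for p, r in zip(probs, ratios):
    print(f"p = {p:.1f}: estimated quantile ratio = {r:.3f}")
```

A ratio that stays roughly constant across `probs` would indicate a proportional shift of the whole distribution, whereas a ratio that varies with `p` points to changes that differ between the lower and upper parts of the distribution.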
• Journal Article

#### Multiscale change-point segmentation: beyond step functions

Electronic Journal of Statistics 2019; 13(2) p.3254-3296
Modern multiscale type segmentation methods are known to detect multiple change-points with high statistical accuracy, while allowing for fast computation. Underpinning (minimax) estimation theory has been developed mainly for models that assume the signal as a piecewise constant function. In this paper, for a large collection of multiscale segmentation methods (including various existing procedures), such theory will be extended to certain function classes beyond step functions in a nonparametric regression setting. This extends the interpretation of such methods on the one hand and on the other hand reveals these methods as robust to deviation from piecewise constant functions. Our main finding is the adaptation over nonlinear approximation classes for a universal thresholding, which includes bounded variation functions, and (piecewise) Hölder functions of smoothness order 0<α≤1 as special cases. From this we derive statistical guarantees on feature detection in terms of jumps and modes. Another key finding is that these multiscale segmentation methods perform nearly (up to a log-factor) as well as the oracle piecewise constant segmentation estimator (with known jump locations), and the best piecewise constant approximants of the (unknown) true signal. Theoretical findings are examined by various numerical simulations.
• Journal Article

#### Tests for qualitative features in the random coefficients model

Electronic Journal of Statistics 2019; 13(2) p.2257-2306
The random coefficients model is an extension of the linear regression model that allows for unobserved heterogeneity in the population by modeling the regression coefficients as random variables. Given data from this model, the statistical challenge is to recover information about the joint density of the random coefficients, which is a multivariate and ill-posed problem. Because of the curse of dimensionality and the ill-posedness, nonparametric estimation of the joint density is difficult and suffers from slow convergence rates. Larger features, such as an increase of the density along some direction or a well-accentuated mode, can, however, be detected much more easily from data by means of statistical tests. In this article, we follow this strategy and construct tests and confidence statements for qualitative features of the joint density, such as increases, decreases and modes. We propose a multiple testing approach based on aggregating single tests which are designed to extract shape information on fixed scales and directions. Using recent tools for Gaussian approximations of multivariate empirical processes, we derive expressions for the critical value. We apply our method to simulated and real data.
• Journal Article

#### Enhanced genome assembly and a new official gene set for Tribolium castaneum

BMC Genomics. 2020 Jan 14;21(1):47
Background The red flour beetle Tribolium castaneum has emerged as an important model organism for the study of gene function in development and physiology, for ecological and evolutionary genomics, for pest control and a plethora of other topics. RNA interference (RNAi), transgenesis and genome editing are well established and the resources for genome-wide RNAi screening have become available in this model. All these techniques depend on a high quality genome assembly and precise gene models. However, the first version of the genome assembly was generated by Sanger sequencing and with only a small set of RNA sequence data, which limited annotation quality. Results Here, we present an improved genome assembly (Tcas5.2) and an enhanced genome annotation resulting in a new official gene set (OGS3) for Tribolium castaneum, which significantly increase the quality of the genomic resources. By adding large-distance jumping library DNA sequencing to join scaffolds and fill small gaps, the gaps in the genome assembly were reduced and the N50 increased to 4753 kbp. The precision of the gene models was enhanced by the use of a large body of RNA-Seq reads of different life history stages and tissue types, leading to the discovery of 1452 novel gene sequences. We also added new features such as alternative splicing, well defined UTRs and microRNA target predictions. For quality control, 399 gene models were evaluated by manual inspection. The current gene set was submitted to Genbank and accepted as a RefSeq genome by NCBI. Conclusions The new genome assembly (Tcas5.2) and the official gene set (OGS3) provide enhanced genomic resources for genetic work in Tribolium castaneum. The much improved information on transcription start sites supports transgenic and gene editing approaches. Further, novel types of information such as splice variants and microRNA target genes open additional possibilities for analysis.
• Journal Article

#### Global exponential stability and existence of periodic solutions of fuzzy wave equations

Advances in Difference Equations. 2020 Jan 07;2020(1):13
In this paper, the global exponential stability and the existence of periodic solutions of fuzzy wave equations are investigated. By variable substitution the system of partial differential equations (PDEs) is transformed from second order to first order. Some sufficient conditions that ensure the global exponential stability and the existence of periodic solution of the system are obtained by an analysis that uses a suitable Lyapunov functional. In addition, a concrete example is given to show the effectiveness of the results.
• Journal Article

#### The Protein‐Coding Human Genome: Annotating High‐Hanging Fruits

BioEssays 2019; 41(11): Art. 1900066
The major transcript variants of human protein-coding genes are annotated to a certain degree of accuracy by combining manual curation, transcript data, and proteomics evidence. However, there is considerable disagreement on the annotation of about 2000 genes (they can be protein-coding, noncoding, or pseudogenes) and on the annotation of most of the predicted alternative transcripts. Pure transcriptome mapping approaches seem to be limited in discriminating functional expression from noise. These limitations have partially been overcome by dedicated algorithms to detect alternatively spliced micro-exons and wobble splice variants. Recently, knowledge about splice mechanisms and protein structure has been incorporated into an algorithm to predict neighboring homologous exons, often spliced in a mutually exclusive manner. Predicted exons are evaluated by transcript data, structural compatibility, and evolutionary conservation, revealing hundreds of novel coding exons and splice mechanism re-assignments. The emerging human pan-genome is necessitating distinctive annotations incorporating differences between individuals and between populations.
• Journal Article

#### Molecular contribution function in RESOLFT nanoscopy

Optics Express 2019; 27(15): Art. 21956
The ultimate objective of a microscope of the highest resolution is to map the molecules of interest in the sample. Traditionally, linear imaging systems are characterized by their spatial frequency transfer function, which is given, in real space, by the point spread function (PSF). By extending the concept of the PSF towards the molecular contribution function (MCF), which quantifies the average contribution of a single fluorophore to the image, a straightforward concept for counting fluorophores is obtained. Using reversible saturable optical fluorescence transitions (RESOLFT), fluorophores are effectively activated only in a small, subdiffraction-sized volume before they are read out. During readout the signal exhibits an increased variance due to the stochastic nature of prior activation, which scales quadratically with the brightness of the active fluorophores while the mean of the signal scales only linearly with it. Using a two-state Markov model for the activation, which shows comparable behavior to the switching kinetics of the switchable fluorescent protein rsEGFP2, we can approximate quantitatively the MCF of RESOLFT nanoscopy, allowing us to count the number of fluorophores within a subdiffraction-sized region of the sample. The method is validated on measurements of tubulin structures in Drosophila melanogaster larvae. Modeling and estimation of the MCF is a promising approach to quantitative microscopy.
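The scaling statement in this abstract (mean linear in brightness, activation variance quadratic in brightness) can be illustrated with a deliberately simplified toy model. The sketch below assumes that each of `n_fluorophores` is activated independently with probability `p_active` and that every active fluorophore contributes exactly `brightness` to the signal; shot noise, the PSF and the paper's two-state Markov model are ignored. The function name and parameters are hypothetical.

```python
def signal_moments(n_fluorophores, p_active, brightness):
    """Mean and activation-induced variance of a toy readout signal.

    Assumed toy model: signal = brightness * K, where
    K ~ Binomial(n_fluorophores, p_active) counts active fluorophores.
    """
    mean = brightness * n_fluorophores * p_active
    variance = brightness ** 2 * n_fluorophores * p_active * (1.0 - p_active)
    return mean, variance


# Doubling the brightness doubles the mean but quadruples the variance.
for b in (2.0, 4.0):
    mean, variance = signal_moments(n_fluorophores=10, p_active=0.5, brightness=b)
    print(f"brightness = {b}: mean = {mean}, variance = {variance}")
```

In this toy model the ratio of variance to mean equals `brightness * (1 - p_active)`, so the two moments together carry information about the number of contributing fluorophores, which is the intuition behind counting from signal fluctuations.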
• Journal Article

#### Improving bimanual interaction with a prosthesis using semi-autonomous control

Journal of NeuroEngineering and Rehabilitation. 2019 Nov 14;16(1):140
Background The loss of a hand is a traumatic experience that substantially compromises an individual’s capability to interact with his environment. Myoelectric prostheses are state-of-the-art (SoA) functional replacements for the lost limbs. Their overall mechanical design and dexterity have improved over the last few decades, but the users have not been able to fully exploit these advances because of the lack of effective and intuitive control. Bimanual tasks are particularly challenging for an amputee since prosthesis control needs to be coordinated with the movement of the sound limb. So far, bimanual activities have often been neglected by the prosthetic research community. Methods We present a novel method for prosthesis control, which uses a semi-autonomous approach in order to simplify bimanual interactions. The approach supplements the commercial SoA two-channel myoelectric control with two additional sensors. Two inertial measurement units were attached to the prosthesis and the sound hand to detect the movement of both limbs. Once a bimanual interaction is detected, the system mimics the coordination strategies of able-bodied subjects to automatically adjust the prosthesis wrist rotation (pronation, supination) and grip type (lateral, palmar) to assist the sound hand during a bimanual task. The system has been evaluated in eight able-bodied subjects performing functional uni- and bi-manual tasks using the novel method and SoA two-channel myocontrol. The outcome measures were time to accomplish the task, semi-autonomous system misclassification rate, subjective rating of intuitiveness, and perceived workload (NASA TLX). Results The results demonstrated that the novel control interface substantially outperformed the SoA myoelectric control. While using the semi-autonomous control, the time to accomplish the task and the perceived workload decreased by 25% and 27%, respectively, while the subjects rated the system as more intuitive than SoA myocontrol.
Conclusions The novel system uses minimal additional hardware (two inertial sensors) and simple processing, and it is therefore convenient for practical implementation. By using the proposed control scheme, the prosthesis assists the user’s sound hand in performing bimanual interactions while decreasing cognitive burden.
• Journal Article

#### THE EXPLICIT MORDELL CONJECTURE FOR FAMILIES OF CURVES

Forum of Mathematics, Sigma 2019; 7: Art. e31
In this article we prove the explicit Mordell Conjecture for large families of curves. In addition, we introduce a method, of easy application, to compute all rational points on curves of quite general shape and increasing genus. The method is based on some explicit and sharp estimates for the height of such rational points, and the bounds are small enough to successfully implement a computer search. As evidence of the simplicity of its application, we present a variety of explicit examples and explain how to produce many others. In the appendix our method is compared in detail to the classical method of Manin–Demjanenko and the analysis of our explicit examples is carried to conclusion.
• Journal Article

#### Semi-supervised tri-Adaboost algorithm for network intrusion detection

International Journal of Distributed Sensor Networks 2019; 15(6)
Network intrusion detection is a relatively mature research topic, but one that remains challenging, particularly as technologies and the threat landscape evolve. Here, a semi-supervised tri-Adaboost (STA) algorithm is proposed. In the algorithm, three different Adaboost algorithms are used as the weak classifiers (both for continuous and categorical data), constituting the decision stumps in the tri-training method. In addition, the chi-square method is used to reduce the feature dimension and improve computational efficiency. We then conduct extensive numerical studies using different training and testing samples from the KDDcup99 dataset. The results demonstrate that (1) high accuracy can be obtained using a training dataset that consists of a small number of labeled and a large number of unlabeled samples; (2) the proposed algorithm is reproducible and consistent over different runs; (3) the proposed algorithm outperforms other existing learning algorithms, even with only a small amount of labeled data in the training phase; and (4) the proposed algorithm has a short execution time and a low false positive rate, while providing a desirable detection rate.
• Journal Article

#### Direct characterization of cytoskeletal reorganization during blood platelet spreading

Progress in Biophysics and Molecular Biology 2019; 144 p.166-176
Blood platelets are the key cellular players in blood clotting and thus of great biomedical importance. While spreading at the site of injury, they reorganize their cytoskeleton within minutes and assume a flat appearance. As platelets possess no nucleus, many standard methods for visualizing cytoskeletal components by means of fluorescence tags fail. Here we employ silicon-rhodamine actin and tubulin probes for imaging these important proteins in a time-resolved manner. We find two distinct timescales for platelet spread area development and for cytoskeletal reorganization, indicating that although cell spreading is most likely associated with actin polymerization at the cell edges, distinct, stress-fiber-like actin structures within the cell, which may be involved in the generation of contractile forces, form on their own timescale. Following microtubule dynamics allows us to distinguish the role of myosin, microtubules and actin during early spreading.
• Journal Article

#### Big data research guided by sociological theory: a triadic dialogue among big data analysis, theory, and predictive models

The Journal of Chinese Sociology. 2019 Jul 05;6(1):11
Computational social science has integrated social science theories and methodology with big data analysis. It has opened a number of new topics for big data analysis and enabled qualitative and quantitative sociological research to provide the ground truth for testing the results of data mining. At the same time, threads of evidence obtained by data mining can inform the development of theory and thereby guide the construction of predictive models to infer and explain more phenomena. Using the example of the Internet data of China’s venture capital industry, this paper shows the triadic dialogue among data mining, sociological theory, and predictive models and forms a methodology of big data analysis guided by sociological theories.
• Journal Article

#### An anisotropic interaction model for simulating fingerprints

Journal of Mathematical Biology
Evidence suggests that both the interaction of so-called Merkel cells and the epidermal stress distribution play an important role in the formation of fingerprint patterns during pregnancy. To model the formation of fingerprint patterns in a biologically meaningful way these patterns have to become stationary. For the creation of synthetic fingerprints it is also very desirable that rescaling the model parameters leads to rescaled distances between the stationary fingerprint ridges. Based on these observations, as well as the model introduced by Kücken and Champod we propose a new model for the formation of fingerprint patterns during pregnancy. In this anisotropic interaction model the interaction forces not only depend on the distance vector between the cells and the model parameters, but additionally on an underlying tensor field, representing a stress field. This dependence on the tensor field leads to complex, anisotropic patterns. We study the resulting stationary patterns both analytically and numerically. In particular, we show that fingerprint patterns can be modeled as stationary solutions by choosing the underlying tensor field appropriately.
• Journal Article

#### The Proximal Alternating Minimization Algorithm for Two-Block Separable Convex Optimization Problems with Linear Constraints

Journal of Optimization Theory and Applications
The Alternating Minimization Algorithm has been proposed by Paul Tseng to solve convex programming problems with two-block separable linear constraints and objectives, whereby (at least) one of the components of the latter is assumed to be strongly convex. The fact that one of the subproblems to be solved within the iteration process of this method does not usually correspond to the calculation of a proximal operator through a closed formula affects the implementability of the algorithm. In this paper, we allow in each block of the objective a further smooth convex function and propose a proximal version of the algorithm, which is achieved by equipping the algorithm with proximal terms induced by variable metrics. For suitable choices of the latter, the solving of the two subproblems in the iterative scheme can be reduced to the computation of proximal operators. We investigate the convergence of the proposed algorithm in a real Hilbert space setting and illustrate its numerical performances on two applications in image processing and machine learning.
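For readers unfamiliar with the phrase "proximal operator through a closed formula", a standard textbook instance is the proximal operator of a scaled l1-norm, which reduces to componentwise soft-thresholding. The sketch below shows only that generic building block; it is not the proximal alternating minimization algorithm of the paper, and the function name `prox_l1` and its `step` parameter are chosen here for illustration.

```python
import numpy as np


def prox_l1(v, step):
    """Proximal operator of step * ||.||_1, i.e. soft-thresholding.

    Returns argmin_x  step * ||x||_1 + 0.5 * ||x - v||_2^2,
    which has the closed form sign(v) * max(|v| - step, 0).
    """
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - step, 0.0)


# Entries with magnitude below `step` are set to zero, the rest are shrunk.
print(prox_l1([3.0, -0.5, 1.0, -2.5], step=1.0))
```

When a subproblem of an iterative scheme can be written in this form, each iteration costs only a few vector operations, which is why reducing subproblems to proximal operators matters for implementability.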
• Journal Article

#### Quantitative Convergence Analysis of Iterated Expansive, Set-Valued Mappings

Mathematics of Operations Research 2018; 43(4) p.1143-1176
We develop a framework for quantitative convergence analysis of Picard iterations of expansive set-valued fixed point mappings. There are two key components of the analysis. The first is a natural generalization of single-valued averaged mappings to expansive set-valued mappings that characterizes a type of strong calmness of the fixed point mapping. The second component to this analysis is an extension of the well-established notion of metric subregularity—or inverse calmness—of the mapping at fixed points. Convergence of expansive fixed point iterations is proved using these two properties, and quantitative estimates are a natural by-product of the framework. To demonstrate the application of the theory, we prove, for the first time, a number of results showing local linear convergence of nonconvex cyclic projections for inconsistent (and consistent) feasibility problems, local linear convergence of the forward-backward algorithm for structured optimization without convexity, strong or otherwise, and local linear convergence of the Douglas-Rachford algorithm for structured nonconvex minimization. This theory includes earlier approaches for known results, convex and nonconvex, as special cases.
• Journal Article

#### Latency-Sensitive Data Allocation and Workload Consolidation for Cloud Storage

IEEE Access 2018; 6 p.76098-76110
Customers often suffer from the variability of data access time in (edge) cloud storage service, caused by network congestion, load dynamics, and so on. One efficient solution to guarantee a reliable latency-sensitive service (e.g., for industrial Internet of Things application) is to issue requests with multiple download/upload sessions which access the required data (replicas) stored in one or more servers, and use the earliest response from those sessions. In order to minimize the total storage costs, how to optimally allocate data in a minimum number of servers without violating latency guarantees remains a crucial issue for the cloud provider to deal with. In this paper, we study the latency-sensitive data allocation problem, the latency-sensitive data reallocation problem and the latency-sensitive workload consolidation problem for cloud storage. We model the data access time as a given distribution whose cumulative distribution function is known, and prove that these three problems are NP-hard. To solve them, we propose an exact integer nonlinear program (INLP) and a Tabu Search-based heuristic. The simulation results reveal that the INLP can always achieve the best performance in terms of a lower number of used nodes and higher storage and throughput utilization, but this comes at the expense of much higher running time. The Tabu Search-based heuristic, on the other hand, can obtain close-to-optimal performance, but in a much lower running time.
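The "earliest response from several sessions" idea has a simple probabilistic reading once the access-time distribution of each session is known. As a minimal sketch, assume the sessions are statistically independent (an assumption made here, not stated in the abstract): the earliest response then arrives within a deadline `t` with probability `1 - prod(1 - F_i(t))`. The helper names and the exponential access-time model are hypothetical.

```python
import math


def exponential_cdf(rate):
    """Return the CDF of an exponential access time (illustrative model)."""
    def cdf(t):
        return 1.0 - math.exp(-rate * t) if t > 0 else 0.0
    return cdf


def earliest_response_cdf(cdfs, t):
    """P(min_i T_i <= t) for independent sessions with the given CDFs."""
    all_late = 1.0
    for cdf in cdfs:
        all_late *= 1.0 - cdf(t)  # every session misses the deadline
    return 1.0 - all_late


def min_sessions(cdf, deadline, target, max_sessions=64):
    """Smallest number of identical sessions meeting the latency target."""
    for k in range(1, max_sessions + 1):
        if earliest_response_cdf([cdf] * k, deadline) >= target:
            return k
    return None  # target not reachable within max_sessions


server = exponential_cdf(rate=2.0)  # mean access time 0.5 time units
print(earliest_response_cdf([server], 0.5))
print(earliest_response_cdf([server, server], 0.5))
print(min_sessions(server, deadline=0.5, target=0.99))
```

Each extra session raises the probability of meeting the deadline but also consumes server throughput, which is the trade-off the allocation and consolidation problems have to balance.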
• Journal Article

#### Upper and lower bounds for the Bregman divergence

Journal of Inequalities and Applications. 2019 Jan 08;2019(1):4
In this paper we study upper and lower bounds on the Bregman divergence $\Delta_{\mathcal{F}}^{\xi}(y,x) := \mathcal{F}(y) - \mathcal{F}(x) - \langle \xi, y-x \rangle$ for some convex functional $\mathcal{F}$ on a normed space $\mathcal{X}$, with subgradient $\xi \in \partial\mathcal{F}(x)$. We give a considerably simpler new proof of the inequalities by Xu and Roach for the special case $\mathcal{F}(x) = \Vert x \Vert^{p}$, $p>1$. The results can be transferred to more general functions as well.
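The definition above can be evaluated directly in a finite-dimensional special case. The following sketch assumes the Euclidean norm on R^n (the paper works in general normed spaces), where F(x) = ||x||^p with p > 1 is differentiable and the subgradient is the gradient p * ||x||^(p-2) * x, taken as zero at x = 0. All function names are chosen here for illustration.

```python
import numpy as np


def power_norm(x, p):
    """F(x) = ||x||_2^p."""
    return np.linalg.norm(x) ** p


def power_norm_gradient(x, p):
    """Gradient of ||x||_2^p for p > 1 (zero vector at the origin)."""
    r = np.linalg.norm(x)
    if r == 0.0:
        return np.zeros_like(x)
    return p * r ** (p - 2) * x


def bregman_divergence(y, x, p):
    """F(y) - F(x) - <xi, y - x> with xi the gradient of F at x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xi = power_norm_gradient(x, p)
    return power_norm(y, p) - power_norm(x, p) - np.dot(xi, y - x)


x = np.array([1.0, 0.0])
y = np.array([0.0, 2.0])
print(bregman_divergence(y, x, p=2))    # for p = 2 this is ||y - x||^2
print(bregman_divergence(y, x, p=1.5))
```

For p = 2 the divergence reduces to the squared distance ||y - x||^2, which gives a quick sanity check; for other p > 1 it is still non-negative by convexity but no longer symmetric in x and y.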