The following poster presents an algorithm designed to detect security exploits (especially malicious memory access) in source code files. The approach combines techniques from several areas of computer science, such as lexical analysis and artificial intelligence, to detect vulnerabilities that might be exploited in the future.
We describe the process of matching e-commerce offers between two catalogues using Python's open-source scientific libraries (NumPy/SciPy, pandas and scikit-learn).
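One common way to frame such catalogue matching (a hedged sketch, not necessarily the authors' exact pipeline; the toy catalogues below are invented for illustration) is to vectorize offer titles with TF-IDF character n-grams and match each offer to its nearest neighbour by cosine distance:

```python
# Hypothetical sketch: matching offers between two catalogues by title
# similarity, using TF-IDF character n-grams and cosine distance.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

catalogue_a = ["apple iphone 6 16gb gold", "samsung galaxy s5 black"]
catalogue_b = ["iPhone 6 gold (16 GB)", "Galaxy S5 by Samsung, black"]

# Character n-grams are robust to word order and small spelling variations.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X_a = vec.fit_transform(catalogue_a)
X_b = vec.transform(catalogue_b)

# For each offer in catalogue B, find its nearest neighbour in catalogue A.
nn = NearestNeighbors(n_neighbors=1, metric="cosine").fit(X_a)
dist, idx = nn.kneighbors(X_b)
matches = {b: catalogue_a[i[0]] for b, i in zip(catalogue_b, idx)}
```

A distance threshold on `dist` would be needed in practice to reject offers that have no counterpart in the other catalogue.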
MUSE is a second-generation instrument installed at the Very Large Telescope (VLT). It is an integral-field spectrograph operating in the visible wavelength range. We observed the Hubble Deep Fields with MUSE, and we will present the data reduction pipeline that we built to process the data and the Python libraries it is composed of (python-cpl, doit, IPython notebook).
We illustrate a high-level parallelization of the partial differential equation solution package SfePy (http://sfepy.org), enabled by the code's capability to integrate an equation over an arbitrary subdomain. Several processes, each responsible for its own subdomain, then assemble their contributions to the global linear system. For the parallel solution we use the PETSc library via petsc4py.
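The per-subdomain assembly idea can be sketched serially with scipy.sparse standing in for PETSc (in the real setting each MPI process would own one subdomain and add its entries into a distributed PETSc matrix via petsc4py; the 1D Poisson problem below is only an illustration):

```python
# Serial sketch of per-subdomain finite-element assembly. Each "process"
# loops over its own elements and emits (row, col, value) contributions.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n_el = 8                                  # linear elements on [0, 1]
h = 1.0 / n_el
subdomains = [range(0, 4), range(4, 8)]   # two element subsets

rows, cols, vals = [], [], []
for sub in subdomains:
    for e in sub:
        ke = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # local stiffness
        dofs = [e, e + 1]
        for i in range(2):
            for j in range(2):
                rows.append(dofs[i]); cols.append(dofs[j]); vals.append(ke[i, j])

# Duplicate (row, col) triplets are summed on conversion -- the same
# "add contributions" semantics PETSc uses when assembling with ADD_VALUES.
A = sp.coo_matrix((vals, (rows, cols)), shape=(n_el + 1, n_el + 1)).tolil()

# Dirichlet BCs u(0)=0, u(1)=1 for the Laplace equation -u'' = 0.
b = np.zeros(n_el + 1)
A[0, :] = 0.0; A[0, 0] = 1.0; b[0] = 0.0
A[-1, :] = 0.0; A[-1, -1] = 1.0; b[-1] = 1.0
u = spla.spsolve(A.tocsr(), b)            # exact solution is u(x) = x
```

The key point is that assembly is purely additive, so subdomains can contribute independently before the global solve.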
JIC (John Innes Centre) and TGAC (The Genome Analysis Centre) are establishing automated field phenotyping platforms. To develop high-throughput bioimage analysis pipelines that analyse plant growth data on high-performance computing infrastructures, we are using SciPy to analyse large phenotypic datasets and generate precise quantitative trait loci (QTLs) linking genes to traits in wheat.
Storing scientific data and metadata consistently and accessing them efficiently is an essential part of research and depends crucially on the available tools. NIXPy is a Python package that allows reading and writing of scientific data and metadata in the NIX format, a general, versatile and open data model and file format that is based on HDF5.
In this poster we will discuss a Python-based toolkit for advanced sampling and analysis in biomolecular simulation, providing details on the Python scientific libraries used to implement it. We will also show examples of the enhanced sampling provided by one of the framework's workflows compared to conventional molecular dynamics techniques, for a variety of biomolecular simulation use cases.
In this work, the interactive multidimensional data exploration tool Glue is extended for use in atmospheric science. This is done by introducing data handlers for geo-referenced and arbitrary multi-dimensional data stored in netCDF format (using Iris and xray, respectively), and by providing mapping capabilities for visualizing geospatial data (using cartopy).
In this project, we are preparing electronic educational materials for the new subject Numerical methods in astrophysics.
How to get data science models into production on a budget
GPAW is an open-source software package for various electronic structure simulations. GPAW is implemented as a combination of the Python and C programming languages. We discuss the main features of GPAW's implementation and present performance results on various massively parallel supercomputers and on accelerator-based (GP-GPU, Xeon Phi coprocessor) systems.
RODRIGUES stands for RATT Online Deconvolved Radio Image Generation Using Esoteric Software. It is a web-based radio telescope calibration and imaging simulator. From a technical perspective it is a web-based parameterised Docker container scheduler with a result-set viewer.
Biophysical instruments usually save their output as files of numbers and text. Data analysis is an integral part of the experiments, leading to a final quantification of the studied phenomena. Working with the data by opening files and copy-pasting into a spreadsheet is laborious and inefficient. We use Python for automated data reading, manipulation, non-linear curve fitting and plotting.
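The non-linear fitting step can be sketched with `scipy.optimize.curve_fit` (the exponential-decay model and the synthetic data below are hypothetical examples, not a specific instrument's output):

```python
# Sketch: automated non-linear curve fitting of (synthetic) instrument
# data, here a single-exponential decay with an offset.
import numpy as np
from scipy.optimize import curve_fit

def decay(t, amplitude, rate, offset):
    return amplitude * np.exp(-rate * t) + offset

# Synthetic "measurement": known parameters plus a little Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 100)
y = decay(t, 2.0, 1.5, 0.3) + rng.normal(0.0, 0.02, t.size)

popt, pcov = curve_fit(decay, t, y, p0=(1.0, 1.0, 0.0))
perr = np.sqrt(np.diag(pcov))   # one-sigma parameter uncertainties
```

In a pipeline, the same fit function is applied to every file read from disk, which is where the gain over spreadsheet copy-pasting comes from.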
Python can be an effective tool for many R&D tasks. However, its use is not as widespread as that of, e.g., Microsoft Excel or MATLAB. Using Python can improve efficiency and reduce costs, but its adoption requires users to become acquainted with it. This talk will show how Python was introduced within Demcon, and will provide methods to introduce Python within your research group or business.
The Large Synoptic Survey Telescope will find more than 10^5 astronomical transients per night - keeping up will be a challenge. Each alert contains only a few datapoints, requiring a Bayesian approach to classification. I'll introduce the active learning techniques we're exploring for optimizing robotic telescope follow-up schedules, and discuss packages used for the underlying Bayesian analysis.
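The core of classification from only a few data points can be illustrated with a minimal Bayes'-rule computation (an illustrative toy, not the actual LSST pipeline; the two classes, their light-curve models and the priors below are invented):

```python
# Illustrative sketch: with only a few flux measurements per alert,
# compare class-conditional Gaussian likelihoods and apply Bayes' rule.
import numpy as np

# Hypothetical per-class models: mean light curve and per-point scatter.
classes = {
    "supernova": (np.array([1.0, 2.0, 3.0]), 0.5),   # rising source
    "variable":  (np.array([1.0, 1.0, 1.0]), 0.5),   # flat source
}
priors = {"supernova": 0.1, "variable": 0.9}

def posterior(obs):
    """Posterior class probabilities for a short vector of observations."""
    unnorm = {}
    for name, (mu, sigma) in classes.items():
        loglike = -0.5 * np.sum(((obs - mu) / sigma) ** 2)
        unnorm[name] = priors[name] * np.exp(loglike)
    z = sum(unnorm.values())
    return {name: p / z for name, p in unnorm.items()}

post = posterior(np.array([1.1, 2.1, 2.9]))   # three-point rising alert
```

Even with a strong prior towards the common class, three well-placed points are enough to flip the posterior, which is what makes follow-up scheduling on sparse alerts feasible.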
TINA is a tool we developed for the analysis of 3D images of network structures. The original application was to study confocal microscopy images of osteocyte networks, i.e., cells which are embedded within the mineralized bone matrix and are connected to each other by dendritic cell processes. The software package as well as its application to bone structures will be presented.
I'll explain the workflow and the technical choices we made in developing a graduate massive open online course that relies entirely on everyone's favourite tools. I'll describe translating IPython notebooks to the edX Open Learning XML format, and a poor man's approach to providing a computational environment to a broad range of participants.
Highly-constrained, large-dimensional, and non-linear optimizations are found at the root of many of today's forefront problems in statistics, quantitative finance, risk, operations research, business analytics, and other predictive sciences. Tools for optimization, however, have not changed much in the past 40 years -- until very recently. The abundance of parallel computing resources has stimulated a change in this area.
To accompany an upcoming O'Reilly book, 'Data-visualisation with Python and Javascript: crafting a dataviz toolchain for the web', this talk will discuss how to extend Python's first-class data-processing stack into the web-browser with Javascripted visualisations.
During my talk I will introduce the Python bindings for libcloudph++, a C++ library of algorithms for representing atmospheric cloud physics in numerical models. I will present the main concept of the bindings, built with Boost.Python, examples of using selected library components from the Python environment, and an example solution for using the bindings to access libcloudph++ from Fortran.
Glumpy is a Python library for scientific visualization that is fast, scalable and beautiful. It offers an intuitive interface between NumPy and modern OpenGL.
gprMax is a freely available set of electromagnetic wave simulation tools based on the Finite-Difference Time-Domain (FDTD) numerical method. Initially developed in the mid-1990s, it has been widely used, principally to simulate Ground Penetrating Radar (GPR), for applications in engineering and geophysics. The original C-based code has been rewritten using Python, NumPy, and Cython with OpenMP.
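The FDTD method itself can be sketched in a few NumPy lines (a minimal 1D vacuum example in normalised units, purely for illustration; gprMax's actual solver is 2D/3D with absorbing boundaries and material models):

```python
# Minimal 1D FDTD sketch: leapfrog updates of E and H on a staggered
# (Yee) grid in vacuum, normalised units, Courant number S = 1.
import numpy as np

nx, nt = 200, 300
ez = np.zeros(nx)        # electric field at integer grid points
hy = np.zeros(nx - 1)    # magnetic field, staggered half a cell

for n in range(nt):
    hy += np.diff(ez)                              # H update
    ez[1:-1] += np.diff(hy)                        # E update; boundaries act as PEC walls
    ez[nx // 2] += np.exp(-((n - 30) / 10.0) ** 2) # soft Gaussian source at the centre
```

The whole-array `np.diff` updates are what NumPy vectorization buys over the element-by-element loops of the original C code; Cython/OpenMP then parallelize the remaining hot loops.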
Using word2vec word vector representations and t-SNE dimensionality reduction, a bird’s-eye view of one or more text sources can be computed. word2vec and t-SNE map the words so that semantically similar words are close to each other in 2D. This enables users to explore a text source like a geographical map.
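The t-SNE mapping step can be sketched as follows (the six words and their vectors are a synthetic stand-in for real word2vec output, which would come from a trained model such as gensim's):

```python
# Sketch of the 2D mapping step: project (stand-in) word vectors with
# t-SNE so that semantically similar words land near each other.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(42)
words = ["cat", "dog", "fish", "car", "bus", "train"]
# Fake "word2vec" output: two tight clusters (animals vs. vehicles).
vectors = np.vstack([
    rng.normal(0.0, 0.1, (3, 50)),   # animal-like vectors
    rng.normal(5.0, 0.1, (3, 50)),   # vehicle-like vectors
])

# perplexity must be smaller than the number of samples.
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(vectors)
```

Plotting `coords` with the `words` as labels gives the "geographical map" view of the text source.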
Most scientific Python libraries incorporate C++ libraries. While several semi-automatic solutions and tools exist to wrap large C++ libraries (Cython, Boost.Python and SWIG), the process of wrapping is cumbersome and time-consuming. AutoWIG relies on the LLVM/Clang technology for parsing C/C++ code and the Mako templating engine for automatically generating Boost.Python wrappers.
As the amount, complexity and resolution of spaceborne synthetic aperture radar (SAR) images have increased considerably over the last decade, processing them can be a headache. We use Python and PyOpenCL to implement SAR wind speed retrieval algorithms (e.g. CMOD5) for single-core and multi-core systems, including GPUs, to process these images.
Almost every engineer or scientist uses a lab notebook to take notes, scribble first design ideas or make calculations. The IPython notebook enables the combination of text, math, interactive code and graphics in a common document. This makes it an ideal platform as an extension of the traditional lab notebook, when calculations get more extensive or if results need to be stored electronically.
Roughly spherical objects are abundant and affect human lives every day--whether the surface of the earth or microscopic viruses that cause severe illness. Using spherical Voronoi diagrams, for which algorithms have only recently been proposed, we can gain insight into these objects. I will discuss an open-source Python implementation and the remaining challenges.
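SciPy now ships such an implementation as `scipy.spatial.SphericalVoronoi` (since SciPy 0.18); a minimal usage sketch, on random points projected onto the unit sphere:

```python
# Sketch: Voronoi regions for points on the unit sphere using SciPy's
# SphericalVoronoi implementation.
import numpy as np
from scipy.spatial import SphericalVoronoi

rng = np.random.default_rng(1)
points = rng.normal(size=(20, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)  # project onto sphere

sv = SphericalVoronoi(points, radius=1.0, center=np.zeros(3))
sv.sort_vertices_of_regions()   # order each region's vertices, e.g. for plotting
```

Each generator point gets exactly one region, and all Voronoi vertices lie on the sphere itself, which is what distinguishes the spherical construction from a planar Voronoi diagram.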
This talk will provide an overview of the basic skills to get started doing data science in Python.
It is important to apply algorithms to real-world situations, as algorithms developed for theoretical problems often do not perform well in practice. In this talk we will discuss why it is important to understand algorithms to successfully put them into practice.
AMUSE uses Python to couple many different existing (and new) astrophysical simulation codes with a consistent interface. This is combined with a powerful data model and many tools for units, I/O and setting up initial conditions. AMUSE includes dozens of these existing codes and is used all over the world for scientific research and teaching.
VVV is an ESO VISTA 4m telescope survey of the Galactic Bulge and Disk in YZJHK. Multi-epoch observations are being obtained in the K band over several years. ESO is distributing the different data products, images and catalogues. These data are key to selecting the optimal EUCLID microlensing fields towards the Galactic Bulge. We will present our new automated procedures for analysing these public data.