Nabin K. Malakar, Ph.D.

NASA JPL
I am a computational physicist working on societal applications of machine-learning techniques.

Research Links

My research interests span multidisciplinary fields, including societal applications of machine learning, decision-theoretic approaches to automated experimental design, Bayesian statistical data analysis, and signal processing.

Linkedin


Interested in the picture? Autonomous experimental design allows us to answer the question of where to take the measurements. More about it is here...

Hobbies

In addition to my research, I also like to hike, bike, read, and play with watercolors.

Thanks for the visit. Please feel free to visit my Weblogs.

Welcome to nabinkm.com. Please visit again.

Showing posts with label research. Show all posts

Thursday, January 23, 2014

Assessing Surface PM2.5 Estimates Using Data Fusion of Active and Passive Remote Sensing Methods

In this paper, we focus on estimating fine particulate matter by combining MODIS satellite Aerosol Optical Depth (AOD) with Weather Research and Forecasting (WRF) PBL information using a neural network approach, and we assess its performance. As part of our analysis, we first explore the baseline effectiveness of AOD and PBL as relevant factors in estimating PM2.5 using passive radiometer and active LIDAR data at CCNY, and demonstrate that the PBL height is the most critical additional parameter for accurate PM2.5 estimation. Furthermore, active measurements from both ground-based and satellite-based lidar are used to show that summer WRF model PBL heights are the most accurate. We then expand our analysis to a regional domain, where daily estimates are obtained and compared with the operational GEOS-CHEM PM2.5 product. Using our approach, we also create regional daily PM2.5 maps and compare them against GEOS-CHEM outputs. Finally, we consider additional improvements in which multiple satellite observations are used as regressors to predict PM2.5. These results illustrate the significant improvement we obtain within this framework in comparison to a “one size fits all” continental-scale approach.
PM2.5 estimation for NY and surrounding states for a particular day.
Published in British Journal of Environment and Climate Change, ISSN: 2231–4784, Vol. 3, Issue 4 (October-December), Special Issue
See full article at: http://www.sciencedomain.org/abstract.php?iid=323&id=10&aid=2530
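
To illustrate why PBL height can matter so much, here is a minimal sketch. It is not the neural network from the paper. It assumes a simple well-mixed boundary layer, so that surface PM2.5 scales roughly with AOD divided by PBL height. The data are synthetic, and the scaling constant, value ranges, and noise level are made up for illustration only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def make_synthetic(n=500, seed=0):
    """Synthetic (AOD, PBL height, PM2.5) triples under a well-mixed assumption."""
    rng = random.Random(seed)
    aod, pbl, pm25 = [], [], []
    for _ in range(n):
        a = rng.uniform(0.05, 0.8)  # column AOD (unitless), hypothetical range
        h = rng.uniform(0.3, 2.5)   # PBL height in km, hypothetical range
        # Hypothetical scaling constant (30) and noise level (1.0).
        p = 30.0 * a / h + rng.gauss(0.0, 1.0)
        aod.append(a)
        pbl.append(h)
        pm25.append(p)
    return aod, pbl, pm25


def compare_predictors():
    """Return correlation of PM2.5 with AOD alone and with AOD / PBL height."""
    aod, pbl, pm25 = make_synthetic()
    ratio = [a / h for a, h in zip(aod, pbl)]
    return pearson(aod, pm25), pearson(ratio, pm25)


if __name__ == "__main__":
    r_aod, r_ratio = compare_predictors()
    print("correlation with AOD alone:", round(r_aod, 3))
    print("correlation with AOD / PBL:", round(r_ratio, 3))
```

In this toy setup, AOD alone explains only part of the PM2.5 variability, while adding the PBL height recovers most of it. A neural network can learn this kind of nonlinear combination from data without the ratio being specified by hand.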

Saturday, January 18, 2014

Survey On The Estimation Of Mutual Information Methods as a Measure of Dependency Versus Correlation Analysis

Link:
http://arxiv.org/abs/1401.3358

In this survey, we present and compare different approaches to estimating Mutual Information (MI) from data in order to analyze general dependencies between variables of interest in a system. We demonstrate the performance difference between MI and correlation analysis, which is only optimal in the case of linear dependencies. First, we use a piece-wise constant Bayesian methodology with a general Dirichlet prior. In this estimation method, we use a two-stage approach in which we first approximate the probability distribution and then calculate the marginal and joint entropies. Here, we demonstrate the performance of this Bayesian approach versus the others for computing the dependency between different variables. We also compare these with linear correlation analysis. Finally, we apply MI and correlation analysis to the identification of the bias in the determination of the aerosol optical depth (AOD) by the satellite-based Moderate Resolution Imaging Spectroradiometer (MODIS) and the ground-based AErosol RObotic NETwork (AERONET). Here, we observe that the AOD measurements by these two instruments might be different for the same location. The reason for this bias is explored by quantifying the dependencies between the bias and 15 other variables, including cloud cover, surface reflectivity, and others.
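
The two-stage idea can be sketched roughly as follows. This is a simplified illustration, not the estimator from the survey: it assumes equal-width bins and a symmetric Dirichlet prior, which amounts to adding a pseudo-count alpha to every cell of the joint histogram. The bin count, alpha, and the synthetic quadratic data are all assumptions chosen for the example.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def bin_index(v, lo, hi, bins):
    """Index of the equal-width bin containing v, clamped to [0, bins - 1]."""
    if hi == lo:
        return 0
    i = int((v - lo) / (hi - lo) * bins)
    return min(max(i, 0), bins - 1)


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(xs, ys, bins=8, alpha=1.0):
    """Stage 1: posterior-mean cell probabilities under a symmetric Dirichlet prior.
    Stage 2: MI = H(X) + H(Y) - H(X, Y) computed from those probabilities."""
    lo_x, hi_x = min(xs), max(xs)
    lo_y, hi_y = min(ys), max(ys)
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        counts[bin_index(x, lo_x, hi_x, bins)][bin_index(y, lo_y, hi_y, bins)] += 1
    total = len(xs) + alpha * bins * bins
    joint = [[(c + alpha) / total for c in row] for row in counts]
    px = [sum(row) for row in joint]
    py = [sum(joint[i][j] for i in range(bins)) for j in range(bins)]
    h_joint = entropy(p for row in joint for p in row)
    return entropy(px) + entropy(py) - h_joint


def make_quadratic(n=2000, seed=0):
    """Synthetic nonlinear dependence: y = x^2 plus small noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [x * x + rng.gauss(0.0, 0.02) for x in xs]
    return xs, ys


def make_independent(n=2000, seed=1):
    """Synthetic independent pair, for reference."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_quadratic()
    print("correlation:", round(pearson(xs, ys), 3))
    print("MI (nats):  ", round(mutual_information(xs, ys), 3))
```

For the quadratic data the linear correlation is close to zero even though y is almost fully determined by x, while the MI estimate is clearly positive. That is the kind of dependency correlation analysis misses.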

And related:
Towards Identification of Relevant Variables in the observed Aerosol Optical Depth Bias between MODIS and AERONET observations

http://arxiv.org/abs/1302.2969


Estimation and bias correction of aerosol abundance using data-driven machine learning and remote sensing
http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6382197&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6382197

Wednesday, January 1, 2014

Presentations for the 94th American Meteorological Society Annual Meeting, Atlanta, GA


Monday, 3 February 2014: 11:15 AM

Regional estimates of ground level Aerosol using Satellite Remote Sensing and Machine-Learning
Room C204 (The Georgia World Congress Center )
Nabin Malakar, City College of New York, New York, NY; and A. Atia, B. Gross, F. Moshary, S. Ahmed, and D. Lary
Ground-level aerosols are known to have a harmful impact on people's health. The Moderate Resolution Imaging Spectroradiometer (MODIS) sensors onboard the Aqua and Terra satellites retrieve aerosol optical depth (AOD) at various bands. A comparison between the AOD measured by the satellite MODIS instruments and by the ground-based Aerosol Robotic Network (AERONET) system at 550 nm shows that there is a bias between the two data products. In this study we explore the factors that can delineate these extrema and/or explain them statistically. We use the MODIS 3 km and 10 km resolution AOD products, and develop a machine-learning framework to compare the Aqua and Terra MODIS-retrieved AODs with the ground-based AERONET observations. The analysis uses several measured variables, such as the MODIS AOD, surface type, and land use, as inputs to train a neural network in regression mode, with a special emphasis on biases observed over non-vegetated urban surfaces. The result is an estimator that produces bias-corrected AOD. This research is part of our goal to provide air quality information, with a special focus on the northeast region of the USA, which can also be useful for developing regional-level decision support tools.

Tuesday, 4 February 2014: 4:00 PM
A Regional NN estimator of PM2.5 using satellite AOD and WRF meteorology measurements
Room C206 (The Georgia World Congress Center )
Lina Cordero, City College of New York, New York, NY; and N. Malakar, D. Vidal, R. Latto, B. Gross, F. Moshary, and S. Ahmed
Besides affecting the global energy balance, aerosols can have a significant health impact. In particular, extended exposure to ultrafine particles is a major concern, and EPA regulations have been established to deal with this issue. Unfortunately, measuring surface aerosols over wide areas is costly and difficult, so the potential of using satellite remote sensing and/or models becomes an important area of study. In this presentation, we explore the potential of combining meteorological data with column-integrated AOD within a neural network approach. To begin, the study is restricted to New York City, where accurate AERONET AOD, lidar-derived PBL heights, and weather station meteorology are available. The main result of this isolated study is that, beyond AOD, the next most important factor is the PBL height. This result motivates an extended study in which MODIS mosaic AODs are combined with WRF weather forecast model inputs, including PBL height. To support the use of WRF PBL, a matchup between WRF and CALIPSO is given for single-layer cases, illustrating strong correlations in spring and summer, when PM2.5 is most important. In particular, we find that with seasonal training we are generally able to improve on the existing approach used by the IDEA (Infusing satellite Data into Environmental air quality Applications) product, which uses MODIS AOD and GEOS-CHEM PM2.5/AOD factors. In addition, we explore potential improvements that can occur if we filter aloft plumes from the processing stream using the NAAPS air forecast model, as well as the use of EOFs to fill gaps in the AOD spatial imagery.

Thursday, 6 February 2014: 9:00 AM
Use of NN based approaches to create high resolution climate meteorological forecasts
Room C101 (The Georgia World Congress Center )
Nabin Malakar, City College, New York, NY; and B. Gross, J. E. Gonzalez, P. Yang, and F. Moshary
Assessing the effects of global climate forecasts on regional-scale domains requires that low-resolution GCM forecast data be intelligently modified so that they can be injected into high-resolution models, such as terrestrial ecosystem models. This is often called downscaling in the climate forecast literature and is usually performed using one of two strategies. In the first strategy, purely statistical approaches such as interpolation are applied to the low-resolution GCM data to provide the high-resolution data. Of course, the “high” resolution data do not really contain any high-resolution inputs that can drive regional-scale models. In particular, valuable high-resolution information such as land surface identification and potential emission sources is not used. In the second strategy, a regional meteorological model such as WRF is used, with the GCM conditions and the forecasted land surface properties encoded into a future time slice. Of course, this approach is extremely computer intensive, and the performance may not be worth the computing resources. In this presentation, we make use of an intermediate approach in which low-resolution meteorological data, including both surface and column-integrated parameters, are combined with high-resolution land surface classification parameters within an NN training scheme in an attempt to improve on purely interpolative approaches. Our study region is the North East domain [{35N, 45N} x {85W, 65W}]. We focus on high and low temperature extremes; these outputs are obtained from the PRISM data set, while the low-resolution (0.5 deg) meteorological parameters, including Tmax, Tmin, Rhum, wind speed, radiation, precipitation, and planetary boundary layer height, are obtained from the ISI-MIP climatology forecast database.
In addition, a high-resolution land surface map based on the 2006 USGS land surface map is used. Preliminary results show that the NN approach can result in improved high-resolution performance in areas where land surface features change rapidly. In addition, we will make comparisons using the WRF model for the period 2006-2011.


Tuesday, December 10, 2013

Heading to San Francisco For #AGU13

Bias Correction of high resolution MODIS Aerosol Optical Depth in urban areas using the Dragon AERONET Network
Nabin K Malakar, Adam Atia, Barry Gross, Fred Moshary, Samir A Ahmed,
The City College of New York, New York, NY, United States.
and
David J Lary
University of Texas at Dallas, Richardson, TX, United States.

Abstract
Aerosol optical depth (AOD) is a widely used parameter for quantifying aerosol abundance. Satellite retrieval of aerosols over land is fundamentally more complex than aerosol retrieval over oceans. Due to its wide coverage and extensive validation, the Moderate Resolution Imaging Spectroradiometer (MODIS), on board the Terra and Aqua satellites, is the workhorse instrument used to retrieve AOD from space. However, satellite AOD algorithms are extremely complex and depend strongly on sun/view geometry, spectral surface albedo, aerosol model assumptions, and surface heterogeneity. This issue becomes even more severe when considering the new MODIS 3 km aerosol retrieval products within version 6. To assess satellite retrievals of these high-resolution 3 km products, we use the summer 2011 Dragon AERONET data to assess accuracy as well as major retrieval biases that can occur in MODIS measurements.

In this study, we explore in detail the factors that can drive these biases statistically. As discussed above, our analysis considers multiple conditions, such as surface reflectivity at various wavelengths, solar and sensor zenith angles, solar and sensor azimuth, and scattering angles, as well as meteorological factors and aerosol type (Angstrom coefficient). These are used as inputs to train a neural network in regression mode to compensate for biases against the Dragon AERONET AOD values.

In particular, we confirm the results of previous studies in which the land cover (urban fraction) appears to be a strong factor in AOD bias, and we develop an NN estimator which includes land cover directly. The algorithm will be tested not only in the Baltimore/Washington area but also assessed in the broader North East US, where urban biases in the NYC area have been previously identified.
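
The bias-correction idea can be sketched in a few lines. To keep the example short, a one-variable linear least-squares fit stands in for the neural network described in the abstract, and urban fraction is the only input. The matchup data are synthetic: the 0.25 urban bias slope, the noise level, and the value ranges are assumptions made up for illustration, not results from the study.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rmse(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def make_synthetic_matchups(n=300, seed=1):
    """Synthetic (MODIS AOD, AERONET AOD, urban fraction) matchups."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        aeronet = rng.uniform(0.05, 0.6)  # hypothetical ground-truth AOD
        urban = rng.uniform(0.0, 1.0)     # hypothetical urban fraction
        # Hypothetical bias: satellite AOD is too high over urban surfaces.
        modis = aeronet + 0.25 * urban + rng.gauss(0.0, 0.02)
        rows.append((modis, aeronet, urban))
    return rows


def demo():
    """Fit the bias on a training split and evaluate on held-out matchups."""
    rows = make_synthetic_matchups()
    train, test = rows[:200], rows[200:]
    intercept, slope = fit_line(
        [u for _, _, u in train],
        [m - a for m, a, _ in train],  # bias = MODIS - AERONET
    )
    truth = [a for _, a, _ in test]
    raw_error = rmse([m for m, _, _ in test], truth)
    corrected = [m - (intercept + slope * u) for m, _, u in test]
    corrected_error = rmse(corrected, truth)
    return slope, raw_error, corrected_error


if __name__ == "__main__":
    slope, raw_error, corrected_error = demo()
    print("fitted bias slope vs urban fraction:", round(slope, 3))
    print("RMSE before correction:", round(raw_error, 3))
    print("RMSE after correction: ", round(corrected_error, 3))
```

A neural network plays the same role as fit_line here, but it can take many inputs at once (geometry, surface reflectivity, land cover) and capture nonlinear interactions between them.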

Saturday, November 9, 2013

EMEP Poster Presentation


INJECTION OF METEOROLOGICAL FACTORS INTO SATELLITE ESTIMATES OF SURFACE PM2.5
Nabin Malakar
in collaboration with
Lina Cordero, Yonghua Wu, Barry Gross, Fred Moshary, Mike Ku

Abstract
Prior efforts to connect surface PM2.5 to satellite retrieval of aerosol optical depth (AOD) have mainly relied on statistical approaches connecting AIRNow PM2.5 measurements and satellite AOD for different seasons and geographic regions. However, this approach does not account for complex aerosol behavior, including planetary boundary layer (PBL) dynamics. In another approach, used operationally within the IDEA (Infusing satellite Data into Environmental air quality Applications) product, a global model (GEOS-CHEM) is used to estimate, on a daily basis, the spatial relationship between forecast PM2.5 and column path AOD, which can then be used with satellite AOD estimates. However, one difficulty with the GEOS-CHEM approach is the poor spatial resolution symptomatic of global models (2.5 degrees), which in particular fails to resolve issues at the urban/nonurban interface. To improve on this, we consider the WRF/CMAQ model, a high-resolution model that accounts for physically based meteorological factors and surface boundary conditions, including emission inventories, to estimate particulate concentrations and vertical distributions.
Because of the complexity observed in the PM2.5-AOD relationship, our focal point is the application of a neural network to better describe the non-linear conditions surrounding the PM2.5-AOD environment, while at the same time investigating other dependences such as additional factors or seasonal changes. Neural networks have proven to perform well in different areas of study, including atmospheric sciences, where many complex relationships cannot be sufficiently understood using statistical approaches. As part of our analysis, we first explore the baseline effectiveness of AOD and PBL as strong factors in estimating PM2.5 in a local experiment using data collected at one site in New York City. Then, we expand our analysis to a regional domain where daily estimates are compared based on site location and season. In our local test, we find very good agreement of the neural network estimator when AOD, PBL, and seasonality are ingested (R~0.94 in summer). Next, we test our regional network and compare it with the GEOS-CHEM product. In particular, we find significant improvement with the NN approach, with better correlation and less bias in comparison with GEOS-CHEM. Further, we show that additional improvements are obtained if additional satellite information, including satellite/view geometry and land surface reflection, is included. Finally, comparisons with WRF/CMAQ PM2.5 are included.

Presenting Poster

%%%%%

About 2013 EMEP Conference
Environmental Monitoring, Evaluation and Protection in New York: Linking Science and Policy

Holiday Inn - Wolf Road
205 Wolf Road, Albany, New York
November 6 & 7

This conference brings together policy makers and nationally renowned scientists to share information on environmental research initiatives in New York State.
Production and use of energy impose one of the greatest burdens on our environment of any human activity. The Environmental Monitoring, Evaluation, and Protection (EMEP) Program at NYSERDA provides scientifically credible and objective information on environmental impacts of energy systems to assist the state in developing science-based and cost-effective policies to mitigate impacts. The EMEP program supports policy-relevant research in order to enhance understanding of energy-related environmental issues.

Thursday, July 26, 2012

Statistical Physics of Human Mobility: Paper

Statistical physics helps relate the microscopic properties of atoms and molecules to the macroscopic properties of materials that can be observed in everyday life. As a result, it is able to explain thermodynamics as a natural result of statistics, classical mechanics, and quantum mechanics at the microscopic level. [1]

By looking at GPS data collected from vehicles in Italy, Gallotti et al. have applied ideas from statistical physics to describe the properties of human mobility.

Human mobility is an interesting research question. Understanding human mobility can be useful in urban planning and in understanding the spread of epidemics. In addition, the authors suggest that such studies may also be useful for discovering possible "laws" that can be related to the dynamical cognitive features of individuals.

The figure shows the average speed variance (on the left) and the average speed distribution (on the right); the distribution can be decomposed as a mixture of Gaussians. Two Gaussians with mean speeds of around 20 km/h and 45 km/h emerge. This indicates two distinct driver behaviors. I find this to be an interesting decomposition.
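
For readers who want to try such a decomposition, here is a minimal sketch of fitting a two-component Gaussian mixture with the standard expectation-maximization (EM) algorithm. It is not the authors' fitting procedure. The speeds are synthetic: only the two means (20 and 45 km/h) come from the paper, while the spreads, mixing weights, and sample size are assumptions made up for the example.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(data, iters=200):
    """EM for a 1-D mixture of two Gaussians.
    Returns (weights, means, sigmas), ordered by increasing mean."""
    s = sorted(data)
    n = len(s)
    mean = sum(s) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in s) / n)
    mu = [s[n // 4], s[3 * n // 4]]  # start from the lower and upper quartiles
    sigma = [sd, sd]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: probability that each point belongs to component 0.
        resp = []
        for x in data:
            p0 = w[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = w[1] * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate weight, mean, and spread of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            nk = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, data)) / nk
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / n
    order = sorted((0, 1), key=lambda k: mu[k])
    return ([w[k] for k in order],
            [mu[k] for k in order],
            [sigma[k] for k in order])


def make_synthetic_speeds(n_slow=600, n_fast=400, seed=0):
    """Synthetic average speeds (km/h) drawn from two hypothetical driver groups."""
    rng = random.Random(seed)
    speeds = [rng.gauss(20.0, 5.0) for _ in range(n_slow)]
    speeds += [rng.gauss(45.0, 8.0) for _ in range(n_fast)]
    rng.shuffle(speeds)
    return speeds


if __name__ == "__main__":
    weights, means, sigmas = fit_two_gaussians(make_synthetic_speeds())
    for wk, mk, sk in zip(weights, means, sigmas):
        print("weight", round(wk, 2), "mean", round(mk, 1), "sigma", round(sk, 1))
```

On the synthetic data, the fitted means land close to the two values used to generate it, which is the same kind of two-population picture the paper reports for real trips.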

The left figure shows the statistical distribution of the activity time. The presence of a straight line indicates Benford's law. The figure on the right shows the "total activity time". With the help of the "down time", i.e., the period for which the GPS is turned off, the authors suggest that there are at least three distinct peaks, corresponding to full-time jobs (~9 hrs), part-time jobs (~4 hrs), and night rest (~13 hrs). However, there is also one more peak at around 1 hr of down time. I guess this one-hour peak reflects short-term activities such as shopping.
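
A quick way to see what Benford's law means in practice is the sketch below. It assumes the straight line refers to a log-log plot with slope close to -1, i.e. a density proportional to 1/t. Under that assumption the leading digit d of t appears with probability log10(1 + 1/d). The samples are synthetic and the range of three decades is an arbitrary choice for the example, not a value from the paper.

```python
import math
import random


def benford_probability(d):
    """Benford's law: probability that the leading digit equals d (1..9)."""
    return math.log10(1.0 + 1.0 / d)


def leading_digit(t):
    """Leading decimal digit of a positive number, read from scientific notation."""
    return int(f"{t:e}"[0])


def sample_log_uniform(n, seed=0, decades=3):
    """Draw n values with density proportional to 1/t on [1, 10**decades)."""
    rng = random.Random(seed)
    return [10.0 ** rng.uniform(0.0, decades) for _ in range(n)]


def digit_frequencies(samples):
    """Observed frequency of each leading digit 1..9."""
    counts = {d: 0 for d in range(1, 10)}
    for t in samples:
        counts[leading_digit(t)] += 1
    return {d: c / len(samples) for d, c in counts.items()}


if __name__ == "__main__":
    freq = digit_frequencies(sample_log_uniform(20000))
    for d in range(1, 10):
        print(d, "observed", round(freq[d], 3),
              "Benford", round(benford_probability(d), 3))
```

The observed digit frequencies follow the Benford probabilities closely, with the digit 1 leading roughly 30% of the time.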

In the paper, using the travel time as a cost function, the authors show that the distribution between successive trips is indeed driven by an underlying Benford's law. The ranking of the distribution of the average visitation frequency may also help to understand how people organize their daily agenda. An interesting feature comes out when the average speed distribution for the recorded trips is decomposed as a mixture of two Gaussians: one with ≤ 5 km. I think such a characteristic distribution indicates the local constraints on the movements. Obviously, the motion is not free of constraints. The mobility data are strictly constrained by the road structure.
It would be interesting to see whether statistical phenomena such as "phase transitions" appear in such statistical laws of human mobility.
This is an interesting paper. See [2].


Finally, why do we move from one place to another?
If we assume some aggregate effect on a social scale, are we any different from gas molecules contained in a box? Moreover, it seems someone has to drive an extra mile since the system demands it!

References:
(Special thanks to Prof. Armando Bazzani for allowing me to use the figures.)
[1] http://en.wikipedia.org/wiki/Statistical_physics
[2] Towards a Statistical Physics of Human Mobility
Riccardo Gallotti, Armando Bazzani, Sandro Rambaldi
http://arxiv.org/abs/1207.5698