Author: Kshitij Shrestha
Abstract
Background: Acrylamide, a processing contaminant formed in starchy foods during high-temperature cooking, is classified as a probable human carcinogen. This study conducts a quantitative risk assessment of acrylamide exposure specifically from French fry consumption.
Methods: A one-dimensional Monte Carlo simulation (n = 100,001) was implemented in R. Acrylamide concentration (µg/kg) was modeled using a Weibull distribution (shape = 1.81, scale = 311.50), and French fry consumption (kg/kg bw/week) was modeled using a Logistic distribution (location = 0.00735, scale = 0.00342). Daily exposure (µg/kg bw/day) was calculated. Risk was characterized using the Margin of Exposure (MOE) approach with a BMDL₁₀ of 0.17 mg/kg bw/day (170 µg/kg bw/day) for tumorigenic effects. Two mitigation scenarios were evaluated: a 10% reduction (Scenario 1) and a 90% reduction (Scenario 2) in acrylamide concentration.
Results: The baseline exposure estimate showed a median of 0.224 µg/kg bw/day. The MOE analysis revealed that 13.01% of the population had an MOE > 10,000 (low concern), while 59.13% had an MOE < 1,000, indicating a potential public health concern. Scenario 1 showed minimal improvement, with 13.28% of the population reaching MOE > 10,000 and 55.83% remaining with MOE < 1,000. Scenario 2 demonstrated substantial risk reduction, with 40.87% of the population achieving MOE > 10,000 and 0.35% having MOE < 1,000.
Conclusion: Current acrylamide exposure from French fries poses a potential health concern for more than half of the population. While a 10% reduction in acrylamide levels is largely ineffective, a 90% reduction virtually eliminates MOE values below 1,000, highlighting the importance of effective mitigation strategies such as citric acid treatment during processing.
1. Introduction
Acrylamide formation in thermally processed foods, particularly potato products like French fries, represents a significant food safety challenge. Since its discovery in 2002, extensive research has confirmed acrylamide's neurotoxic and carcinogenic properties. The Margin of Exposure (MOE) approach, endorsed by the European Food Safety Authority (EFSA) and other regulatory bodies, provides a framework for assessing the risk of such genotoxic carcinogens.
This case study employs probabilistic modeling to quantify acrylamide exposure from French fries and evaluate the potential effectiveness of two mitigation strategies. Unlike deterministic methods, this approach captures population variability in both consumption patterns and contaminant concentrations, providing a more realistic risk characterization.
2. Materials and Methods
2.1. Exposure Modeling
A one-dimensional Monte Carlo simulation was conducted using the mc2d package in R with 100,001 iterations. The model incorporated two key stochastic variables:
Acrylamide concentration (µg/kg): Weibull distribution (shape = 1.806815, scale = 311.503027)
Consumption frequency (kg/kg bw/week): Logistic distribution (location = 0.007351559, scale = 0.003424008)
These distributions were selected by fitting a range of candidate continuous distributions to the available concentration and consumption data and comparing their goodness-of-fit statistics.
Daily exposure was calculated as:
Exposure = (Concentration × Consumption) / 7
All negative values generated from the distributions were truncated to zero to maintain biological plausibility.
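The exposure model described above can be sketched in base R. This is an illustrative simplification, not the study's actual script (which uses the mc2d package and is appended at the end of this report); the seed value is an assumption for reproducibility of the sketch only, and the parameter values are the fitted estimates given in this section.

```r
# Base-R sketch of the one-dimensional Monte Carlo exposure model (illustrative).
set.seed(42)  # assumed seed; the original script does not fix one
n <- 100001

# Acrylamide concentration, ug/kg: fitted Weibull distribution
conc <- rweibull(n, shape = 1.806815, scale = 311.503027)

# Weekly fries consumption, kg/kg bw/week: fitted logistic, truncated at zero
cons <- pmax(rlogis(n, location = 0.007351559, scale = 0.003424008), 0)

# Daily exposure, ug/kg bw/day
exposure <- conc * cons / 7
quantile(exposure, probs = c(0.25, 0.50, 0.75, 0.975))
```

Because the logistic distribution has support on the whole real line, the `pmax(..., 0)` truncation is what keeps simulated consumption biologically plausible.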
2.2. Risk Characterization
Risk characterization employed the Margin of Exposure (MOE) approach:
MOE = BMDL₁₀ / Estimated Daily Exposure
where BMDL₁₀ = 170 µg/kg bw/day (0.17 mg/kg bw/day) for tumorigenic effects.
Population risk was categorized as:
MOE > 10,000: Low health concern
MOE < 1,000: Potential health concern
MOE < 100: High health concern
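These MOE bands can be applied to a simulated exposure vector as follows. This is a minimal base-R sketch (the appended script uses mc2d); zero exposures are assigned an infinite MOE, so non-consumers fall into the low-concern band.

```r
# Classify simulated exposures into the MOE bands defined above (illustrative).
set.seed(42)  # assumed seed, for reproducibility of this sketch only
n <- 100001
conc <- rweibull(n, shape = 1.806815, scale = 311.503027)
cons <- pmax(rlogis(n, location = 0.007351559, scale = 0.003424008), 0)
exposure <- conc * cons / 7           # ug/kg bw/day

bmdl10 <- 170                         # ug/kg bw/day, tumorigenic endpoint
moe <- ifelse(exposure > 0, bmdl10 / exposure, Inf)

# Fraction of the simulated population in each band
c(low_concern       = mean(moe > 10000),
  potential_concern = mean(moe < 1000),
  high_concern      = mean(moe < 100))
```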
2.3. Mitigation Scenarios
Two intervention scenarios were modeled:
Scenario 1: 10% reduction in acrylamide concentration (achievable by frying below 175 °C)
Scenario 2: 90% reduction in acrylamide concentration (achievable by citric acid pretreatment)
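The two scenarios amount to scaling the simulated concentration before recomputing exposure and MOE. A base-R sketch under the same model assumptions as Section 2.1 (illustrative only; the seed is assumed):

```r
# Effect of concentration reductions on the MOE distribution (illustrative).
set.seed(42)
n <- 100001
conc <- rweibull(n, shape = 1.806815, scale = 311.503027)
cons <- pmax(rlogis(n, location = 0.007351559, scale = 0.003424008), 0)

# MOE vector for a given fractional reduction in concentration
moe_for <- function(reduction) {
  exposure <- conc * (1 - reduction) * cons / 7
  ifelse(exposure > 0, 170 / exposure, Inf)
}

moe_s1 <- moe_for(0.10)   # Scenario 1: 10% reduction
moe_s2 <- moe_for(0.90)   # Scenario 2: 90% reduction
mean(moe_s1 < 1000)       # fraction still of potential concern, Scenario 1
mean(moe_s2 < 1000)       # fraction still of potential concern, Scenario 2
```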
2.4. Software
All analyses and simulations were performed using R statistical software (version 4.3.1; R Foundation for Statistical Computing). The full R script and its output are attached below.
3. Results
3.1. Baseline Exposure Assessment
The baseline model results are summarized in Table 1. The median acrylamide exposure was 0.224 µg/kg bw/day, with a pronounced right skew in the distribution (97.5th percentile = 1.1 µg/kg bw/day).
Table 1. Baseline Exposure and MOE Distribution
| Percentile | Exposure (µg/kg bw/day) | MOE |
|---|---|---|
| 25% | 0.082 | 2073 |
| 50% | 0.224 | 759 |
| 75% | 0.434 | 392 |
| 97.5% | 1.1 | 155 |
3.2. Risk Characterization
The population risk assessment revealed critical findings:
13.01% of the population had MOE > 10,000
59.13% had MOE < 1,000 (potential concern)
0.35% had MOE < 100 (high concern)
3.3. Effectiveness of Mitigation Strategies
Table 2. Impact of Mitigation Scenarios on Population Risk
| Scenario | % Population MOE > 10,000 | % Population MOE < 1,000 | % Population MOE < 100 |
|---|---|---|---|
| Baseline | 13.01% | 59.13% | 0.35% |
| 10% Reduction | 13.28% | 55.83% | 0.19% |
| 90% Reduction | 40.87% | 0.35% | 0.00% |
Scenario 1 (10% reduction) showed negligible improvement in population risk. In contrast, Scenario 2 (90% reduction) demonstrated substantial benefit, moving over 40% of the population into the "low concern" category (MOE > 10,000) and almost eliminating exposures that would result in MOE values below 1,000.
4. Discussion
The results clearly demonstrate that current exposure levels to acrylamide from French fries present a potential public health concern, with more than half of the population having MOE values below the threshold of 1,000. This finding aligns with concerns raised by regulatory agencies worldwide regarding dietary acrylamide exposure.
The differential effectiveness of the two mitigation scenarios provides crucial insights for risk management. The minimal impact of Scenario 1 suggests that small reductions in acrylamide concentration are insufficient to meaningfully alter population risk. This has important implications for the food industry, indicating that incremental process adjustments may not achieve desired public health outcomes.
Conversely, the dramatic improvement seen in Scenario 2 underscores the potential of effective mitigation technologies. The 90% reduction scenario, potentially achievable through methods such as citric acid pretreatment or enzyme treatment, would protect more than half of the population from potential risk. This represents a significant public health achievement and highlights the importance of developing and implementing robust mitigation strategies.
Limitations: This assessment focused exclusively on French fries as an exposure source, while total dietary acrylamide exposure includes contributions from multiple food categories. Additionally, the model assumed uniform application and effectiveness of mitigation strategies, which may not reflect real-world conditions.
5. Conclusion
This probabilistic risk assessment confirms that acrylamide exposure from French fries represents a meaningful public health concern under current conditions. The analysis demonstrates that while modest reductions in acrylamide levels provide negligible benefit, substantial reductions (90%) can effectively protect a significant portion of the population. These findings support the need for targeted research and development of effective mitigation technologies and provide a quantitative basis for regulatory decision-making regarding acrylamide levels in processed potato products.
References
Tareke, E., et al. (2002). Analysis of acrylamide, a carcinogen formed in heated foodstuffs. Journal of Agricultural and Food Chemistry, 50(17), 4998-5006.
EFSA Panel on Contaminants in the Food Chain (CONTAM). (2015). Scientific opinion on acrylamide in food. EFSA Journal, 13(6), 4104.
Pedreschi, F., et al. (2008). Acrylamide reduction in potato chips by using citric acid. Journal of Food Science, 73(5), E197-E201.
Acrylamide Case Study
Kshitij Shrestha
2025-10-28
# please save this r script/markdown file and the concentration data and consumption data in the same folder
library(rstudioapi)
# Set the working directory to the directory containing the script
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
#Step 1: import the concentration and consumption datasets
library(readxl)
Acrylamide_concentration_data <- read_excel("Acrylamide concentration data.xlsx")
fries_consumption_data <- read_excel("fries consumption data.xlsx")
#####################
#Fitting the Acrylamide concentration data
######################
#Step 2: assign the concentration column as "Concentrationdata" for fitting different distributions
Concentrationdata <-Acrylamide_concentration_data$`AA concentrations mcgperkg`
summary(Concentrationdata)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 56.0 140.0 254.0 276.1 356.0 729.0
#Step 3: load the different relevant libraries
library("fitdistrplus")
## Loading required package: MASS
## Loading required package: survival
library("actuar")
##
## Attaching package: 'actuar'
## The following objects are masked from 'package:stats':
##
## sd, var
## The following object is masked from 'package:grDevices':
##
## cm
library("stats")
library("MASS")
library("survival")
library("triangle")
library("mc2d")
## Loading required package: mvtnorm
##
## Attaching package: 'mc2d'
## The following objects are masked from 'package:base':
##
## pmax, pmin
#Always maximize the plot window before running the program
#If the window is too small, the graph will not fit and plotting will give an error
#A. FITTING CONTINUOUS DISTRIBUTIONS: steps 4 to 10
#B. FITTING DISCRETE DISTRIBUTIONS: from step 11
# LET'S START WITH CONTINUOUS DISTRIBUTION FITTING
# A. FITTING CONTINUOUS DISTRIBUTIONS
#Step 4: # skewness-kurtosis plot of the data
par(mfrow=c(1,1)) # make 1 graph in one page
descdist(Concentrationdata, boot = 1000)
## summary statistics
## ------
## min: 56 max: 729
## median: 254
## mean: 276.0909
## estimated sd: 163.1005
## estimated skewness: 0.7936859
## estimated kurtosis: 3.377937
plotdist(Concentrationdata)
#Step 5:# FITTING DISTRIBUTION
# fit one distribution at a time and check graphs and fitting (summary statistics are done at later stage)
#1. Fitting Triangle distribution
# the minimum, maximum and mode value of the observations were min=56, max=729, mode=98
ft=fitdist(Concentrationdata, "triang", start = list(min=56, max=729, mode=98))
## Warning in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, : Some
## parameter names have no starting/fixed value but have a default value: mean.
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in sqrt(diag(varcovar)): NaNs produced
# If there is an error, check whether any of the actual log-likelihoods are infinite
do.call(dtriang,c(list(x=Concentrationdata,log=TRUE),list(min=56, max=729, mode=98)))
## [1] -Inf -7.610358 -7.071361 -5.818598 -5.818598 -5.832964 -5.837799
## [8] -5.875663 -5.887478 -5.909796 -5.948739 -6.031493 -6.073542 -6.075589
## [15] -6.085887 -6.098388 -6.102589 -6.111046 -6.126019 -6.130339 -6.197478
## [22] -6.202119 -6.230423 -6.235219 -6.344326 -6.532563 -6.568811 -6.682408
## [29] -6.686174 -6.886007 -6.923570 -7.095420 -Inf
# change the starting condition based on summary statistics
plot(ft) # histogram, QQ, CDF and PP plot at once
#2. fitting normal distribution
fn <-fitdist(Concentrationdata,"norm")
plot(fn) # histogram, QQ, CDF and PP plot at once
#3. Fitting weibull distribution
fw <- fitdist(Concentrationdata, "weibull")
plot(fw) # histogram, QQ, CDF and PP plot at once
#4. Fitting lognormal distribution
fln <- fitdist(Concentrationdata, "lnorm")
plot(fln) # histogram, QQ, CDF and PP plot at once
#5. Fitting gamma distribution
fg <- fitdist(Concentrationdata, "gamma")
plot(fg) # histogram, QQ, CDF and PP plot at once
#6. Fitting Loglogistic distribution
fll <- fitdist(Concentrationdata, "llogis")
plot(fll) # histogram, QQ, CDF and PP plot at once
#7. Fitting exponential distribution
fe <-fitdist(Concentrationdata,"exp")
plot(fe) # histogram, QQ, CDF and PP plot at once
#8. Fitting pareto distribution
fp <- fitdist(Concentrationdata, "pareto", start = list(shape = 1, scale = 500))
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
plot(fp) # histogram, QQ, CDF and PP plot at once
#9. Fitting uniform distribution
fu <-fitdist(Concentrationdata,"unif")
plot(fu) # histogram, QQ, CDF and PP plot at once
#10. Fitting logistic distribution
fl <-fitdist(Concentrationdata,"logis")
plot(fl) # histogram, QQ, CDF and PP plot at once
#STEP 6: GRAPHICAL COMPARISON OF THE DIFFERENT FITTED CONTINUOUS DISTRIBUTIONS
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#A. Comparison of density plots of the different fitted distributions
denscomp(ft, legendtext = "triangle")
denscomp(fn, legendtext = "normal")
denscomp(fw, legendtext = "weibull")
denscomp(fln, legendtext = "lognormal")
denscomp(fg, legendtext = "gamma")
denscomp(fll, legendtext = "loglogistic")
denscomp(fe, legendtext = "exponential")
denscomp(fp, legendtext = "pareto")
denscomp(fu, legendtext = "uniform")
denscomp(fl, legendtext = "logistic")
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#B. Comparison of CDF plots of the different fitted distributions
cdfcomp(ft, legendtext = "triangle") # Cumulative Distribution plot
cdfcomp(fn, legendtext = "normal") # Cumulative Distribution plot
cdfcomp(fw, legendtext = "weibull") # Cumulative Distribution plot
cdfcomp(fln, legendtext = "lognormal") # Cumulative Distribution plot
cdfcomp(fg, legendtext = "gamma") # Cumulative Distribution plot
cdfcomp(fll, legendtext = "loglogistic") # Cumulative Distribution plot
cdfcomp(fe, legendtext = "exponential") # Cumulative Distribution plot
cdfcomp(fp, legendtext = "pareto") # Cumulative Distribution plot
cdfcomp(fu, legendtext = "uniform") # Cumulative Distribution plot
cdfcomp(fl, legendtext = "logistic") # Cumulative Distribution plot
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#C. Comparison of QQ plots of the different fitted distributions
qqcomp(ft, legendtext = "triangle") #QQplot
qqcomp(fn, legendtext = "normal")
qqcomp(fw, legendtext = "weibull")
qqcomp(fln, legendtext = "lognormal")
qqcomp(fg, legendtext = "gamma")
qqcomp(fll, legendtext = "loglogistic")
qqcomp(fe, legendtext = "exponential")
qqcomp(fp, legendtext = "pareto")
qqcomp(fu, legendtext = "uniform")
qqcomp(fl, legendtext = "logistic")
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#D. Comparison of PP plots of the different fitted distributions
ppcomp(ft, legendtext = "triangle") #PP plot
ppcomp(fn, legendtext = "normal")
ppcomp(fw, legendtext = "weibull")
ppcomp(fln, legendtext = "lognormal")
ppcomp(fg, legendtext = "gamma")
ppcomp(fll, legendtext = "loglogistic")
ppcomp(fe, legendtext = "exponential")
ppcomp(fp, legendtext = "pareto")
ppcomp(fu, legendtext = "uniform")
ppcomp(fl, legendtext = "logistic")
par(mfrow=c(1,1)) # make graph 1 by 1 in one page (1 in one page)
#Step 7: comparison of summary statistics of the different fits
# comparison of goodness of fit using log-likelihood, AIC and BIC values
# choose the distribution with low AIC and BIC values (best fit)
summary(ft) #fitted triangle distribution
## Fitting of the distribution ' triang ' by maximum likelihood
## Parameters :
## estimate Std. Error
## min -16.9 37.24010
## max 777.6 47.02371
## mode 146.6 NaN
## Loglikelihood: -212.169 AIC: 430.338 BIC: 434.8275
## Correlation matrix:
## min max mode
## min 1.000000 -0.175488 NaN
## max -0.175488 1.000000 NaN
## mode NaN NaN 1
summary(fn) #fitted normal distribution
## Fitting of the distribution ' norm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## mean 276.0909 27.95878
## sd 160.6103 19.76976
## Loglikelihood: -214.4313 AIC: 432.8627 BIC: 435.8557
## Correlation matrix:
## mean sd
## mean 1.000000e+00 2.025087e-06
## sd 2.025087e-06 1.000000e+00
summary(fw) #fitted weibull distribution
## Fitting of the distribution ' weibull ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 1.806815 0.2445039
## scale 311.503027 31.6813979
## Loglikelihood: -211.2729 AIC: 426.5458 BIC: 429.5388
## Correlation matrix:
## shape scale
## shape 1.00000 0.32029
## scale 0.32029 1.00000
summary(fln) #fitted lognormal distribution
## Fitting of the distribution ' lnorm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## meanlog 5.428584 0.11446423
## sdlog 0.657547 0.08093759
## Loglikelihood: -212.1333 AIC: 428.2667 BIC: 431.2597
## Correlation matrix:
## meanlog sdlog
## meanlog 1 0
## sdlog 0 1
summary(fg) #fitted gamma distribution
## Fitting of the distribution ' gamma ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 2.756572945 0.621322135
## rate 0.009984886 0.002443173
## Loglikelihood: -211.2098 AIC: 426.4196 BIC: 429.4126
## Correlation matrix:
## shape rate
## shape 1.0000000 0.9055933
## rate 0.9055933 1.0000000
summary(fll) #fitted loglogistic distribution
## Fitting of the distribution ' llogis ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 2.61607 0.3775947
## scale 238.12784 27.8446554
## Loglikelihood: -212.8607 AIC: 429.7214 BIC: 432.7144
## Correlation matrix:
## shape scale
## shape 1.00000000 0.06713573
## scale 0.06713573 1.00000000
summary(fe) #fitted exponential distribution
## Fitting of the distribution ' exp ' by maximum likelihood
## Parameters :
## estimate Std. Error
## rate 0.003621995 0.0005772947
## Loglikelihood: -218.4841 AIC: 438.9682 BIC: 440.4647
summary(fp) #fitted pareto distribution
## Fitting of the distribution ' pareto ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 4695538 0
## scale 1296381246 0
## Loglikelihood: -218.4841 AIC: 440.9682 BIC: 443.9612
## Correlation matrix:
## shape scale
## shape 1 NaN
## scale NaN 1
summary(fu) #fitted uniform distribution
## Fitting of the distribution ' unif ' by maximum likelihood
## Parameters :
## estimate
## min 56
## max 729
## Loglikelihood: -214.8876 AIC: 433.7752 BIC: 436.7682
summary(fl) #fitted logistic distribution
## Fitting of the distribution ' logis ' by maximum likelihood
## Parameters :
## estimate Std. Error
## location 262.92350 27.76334
## scale 91.21987 13.20561
## Loglikelihood: -214.6138 AIC: 433.2276 BIC: 436.2207
## Correlation matrix:
## location scale
## location 1.00000000 0.06409305
## scale 0.06409305 1.00000000
#Step 8: comparison of the fitted continuous distributions using goodness-of-fit statistics
#remove a fit from the list below if it could not be fitted above
gofstat(list(ft,fn,fw,fln,fg,fll,fe,fp,fu,fl))
## Goodness-of-fit statistics
## 1-mle-triang 2-mle-norm 3-mle-weibull 4-mle-lnorm
## Kolmogorov-Smirnov statistic 0.1578342 0.1363943 0.10071430 0.1679134
## Cramer-von Mises statistic 0.1326371 0.1041954 0.05314437 0.1058692
## Anderson-Darling statistic 0.7412460 0.6393422 0.31540856 0.5822591
## 5-mle-gamma 6-mle-llogis 7-mle-exp 8-mle-pareto
## Kolmogorov-Smirnov statistic 0.1272613 0.14148524 0.2171099 0.2171138
## Cramer-von Mises statistic 0.0619509 0.08449744 0.4622041 0.4622202
## Anderson-Darling statistic 0.3562905 0.55047346 2.5811868 2.5812503
## 9-mle-unif 10-mle-logis
## Kolmogorov-Smirnov statistic 0.3454005 0.09489029
## Cramer-von Mises statistic 1.2633074 0.07235833
## Anderson-Darling statistic Inf 0.55695712
##
## Goodness-of-fit criteria
## 1-mle-triang 2-mle-norm 3-mle-weibull
## Akaike's Information Criterion 430.3380 432.8627 426.5458
## Bayesian Information Criterion 434.8275 435.8557 429.5388
## 4-mle-lnorm 5-mle-gamma 6-mle-llogis 7-mle-exp
## Akaike's Information Criterion 428.2667 426.4196 429.7214 438.9682
## Bayesian Information Criterion 431.2597 429.4126 432.7144 440.4647
## 8-mle-pareto 9-mle-unif 10-mle-logis
## Akaike's Information Criterion 440.9682 433.7752 433.2276
## Bayesian Information Criterion 443.9612 436.7682 436.2207
#compare GOF and select the best three fitted distributions
#based on GOF and the graphs, both the gamma and Weibull distributions fit well over the whole range.
#For our example, the Weibull distribution was chosen
#Step 10: based on GOF and the graphs, select the best fitted continuous distribution
# Note the estimated parameters with their standard errors from the summary statistics
# standard errors can be used as uncertainties in a 2D Monte Carlo simulation
#Step 11: Uncertainties in the estimated parameters of the selected distribution can also be estimated by bootstrapping
#So the selected distribution for the acrylamide concentration data is the Weibull distribution, with parameter uncertainty as follows:
uncertinity <-bootdist(fw,niter = 1001)
summary(uncertinity)
## Parametric bootstrap medians and 95% percentile CI
## Median 2.5% 97.5%
## shape 1.858237 1.458437 2.513762
## scale 310.359322 251.745487 371.033455
plot(uncertinity)
#######################
#Fitting of distribution of the consumption data
##################
#Step 2: assign the consumption column as "Consumptiondata" for fitting different distributions
Consumptiondata <-fries_consumption_data$`weekly consumption of fried French fries kgperkgbwperweek`
summary(Consumptiondata)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000000 0.003750 0.006933 0.007734 0.011767 0.025050
#Step 3: load the different relevant libraries
library("fitdistrplus")
library("actuar")
library("stats")
library("MASS")
library("survival")
library("triangle")
library("mc2d")
#Always maximize the plot window before running the program
#If the window is too small, the graph will not fit and plotting will give an error
#A. FITTING CONTINUOUS DISTRIBUTIONS: steps 4 to 10
#B. FITTING DISCRETE DISTRIBUTIONS: from step 11
# LET'S START WITH CONTINUOUS DISTRIBUTION FITTING
# A. FITTING CONTINUOUS DISTRIBUTIONS
#Step 4: # skewness-kurtosis plot of the data
par(mfrow=c(1,1)) # make 1 graph in one page
descdist(Consumptiondata, boot = 1000)
## summary statistics
## ------
## min: 0 max: 0.02505
## median: 0.006933333
## mean: 0.007734496
## estimated sd: 0.006050041
## estimated skewness: 0.7031494
## estimated kurtosis: 3.32809
plotdist(Consumptiondata)
#Step 5:# FITTING DISTRIBUTION
# fit one distribution at a time and check graphs and fitting (summary statistics are done at later stage)
#1. Fitting triangle distribution
ft=fitdist(Consumptiondata, "triang", start = list(min=0, max=0.02505, mode=0.0001))
## Warning in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, : Some
## parameter names have no starting/fixed value but have a default value: mean.
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in sqrt(diag(varcovar)): NaNs produced
# If there is an error, check whether any of the actual log-likelihoods are infinite
do.call(dtriang,c(list(x=Consumptiondata,log=TRUE),list(min=0, max=0.02505, mode=0.0001)))
## [1] -Inf -Inf -Inf -Inf -Inf -Inf -Inf -Inf
## [9] 4.241235 4.241235 4.241235 4.202106 4.202106 4.202106 4.181952 4.181952
## [17] 4.181952 4.161384 4.160552 4.157219 4.063660 4.059987 3.989482 3.979561
## [25] 3.969541 3.965505 3.954320 3.943009 3.885546 3.817570 3.769541 3.757165
## [33] 3.742108 3.699494 3.688883 3.684875 3.624167 3.487175 3.469068 3.283418
## [41] 3.238421 2.590271 -Inf
# change the starting condition based on summary statistics
plot(ft) # histogram, QQ, CDF and PP plot at once
#2. fitting normal distribution
fn <-fitdist(Consumptiondata,"norm")
plot(fn) # histogram, QQ, CDF and PP plot at once
#3. Fitting gamma distribution
fg <- fitdist(Consumptiondata, "gamma")
plot(fg) # histogram, QQ, CDF and PP plot at once
#4. Fitting exponential distribution
fe <-fitdist(Consumptiondata,"exp")
plot(fe) # histogram, QQ, CDF and PP plot at once
#5. Fitting pareto distribution
fp <- fitdist(Consumptiondata, "pareto", start = list(shape = 1, scale = 500))
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in cov2cor(varcovar): NaNs produced
plot(fp) # histogram, QQ, CDF and PP plot at once
#6. Fitting uniform distribution
fu <-fitdist(Consumptiondata,"unif")
plot(fu) # histogram, QQ, CDF and PP plot at once
#7. Fitting logistic distribution
fl <-fitdist(Consumptiondata,"logis")
plot(fl) # histogram, QQ, CDF and PP plot at once
#STEP 6: GRAPHICAL COMPARISON OF THE DIFFERENT FITTED CONTINUOUS DISTRIBUTIONS
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#A. Comparison of density plots of the different fitted distributions
denscomp(ft, legendtext = "triangle")
denscomp(fn, legendtext = "normal")
denscomp(fg, legendtext = "gamma")
denscomp(fe, legendtext = "exponential")
denscomp(fp, legendtext = "pareto")
denscomp(fu, legendtext = "uniform")
denscomp(fl, legendtext = "logistic")
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#B. Comparison of CDF plots of the different fitted distributions
cdfcomp(ft, legendtext = "triangle") # Cumulative Distribution plot
cdfcomp(fn, legendtext = "normal") # Cumulative Distribution plot
cdfcomp(fg, legendtext = "gamma") # Cumulative Distribution plot
cdfcomp(fe, legendtext = "exponential") # Cumulative Distribution plot
cdfcomp(fp, legendtext = "pareto") # Cumulative Distribution plot
cdfcomp(fu, legendtext = "uniform") # Cumulative Distribution plot
cdfcomp(fl, legendtext = "logistic") # Cumulative Distribution plot
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#C. Comparison of QQ plots of the different fitted distributions
qqcomp(ft, legendtext = "triangle") #QQplot
qqcomp(fn, legendtext = "normal")
qqcomp(fg, legendtext = "gamma")
qqcomp(fe, legendtext = "exponential")
qqcomp(fp, legendtext = "pareto")
qqcomp(fu, legendtext = "uniform")
qqcomp(fl, legendtext = "logistic")
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)
#D. Comparison of PP plots of the different fitted distributions
ppcomp(ft, legendtext = "triangle") #PP plot
ppcomp(fn, legendtext = "normal")
ppcomp(fg, legendtext = "gamma")
ppcomp(fe, legendtext = "exponential")
ppcomp(fp, legendtext = "pareto")
ppcomp(fu, legendtext = "uniform")
ppcomp(fl, legendtext = "logistic")
par(mfrow=c(1,1)) # make graph 1 by 1 in one page (1 in one page)
#Step 7: comparison of summary statistics of the different fits
# comparison of goodness of fit using log-likelihood, AIC and BIC values
# choose the distribution with low AIC and BIC values (best fit)
summary(ft) #fitted triangle distribution
## Fitting of the distribution ' triang ' by maximum likelihood
## Parameters :
## estimate Std. Error
## min -0.002505 6.481336e-04
## max 0.026720 6.581590e-09
## mode 0.001770 NaN
## Loglikelihood: 161.605 AIC: -317.21 BIC: -311.9264
## Correlation matrix:
## min max mode
## min 1.000000e+00 -6.515768e-07 NaN
## max -6.515768e-07 1.000000e+00 NaN
## mode NaN NaN 1
summary(fn) #fitted normal distribution
## Fitting of the distribution ' norm ' by maximum likelihood
## Parameters :
## estimate Std. Error
## mean 0.007734496 0.0009118313
## sd 0.005979278 0.0005653711
## Loglikelihood: 159.1222 AIC: -314.2445 BIC: -310.7221
## Correlation matrix:
## mean sd
## mean 1 0
## sd 0 1
summary(fg) #fitted gamma distribution
## Fitting of the distribution ' gamma ' by maximum likelihood
## Parameters :
## estimate
## shape 1.634359
## rate 211.307741
## Loglikelihood: -92341796821 AIC: 184683593646 BIC: 184683593650
summary(fe) #fitted exponential distribution
## Fitting of the distribution ' exp ' by maximum likelihood
## Parameters :
## estimate Std. Error
## rate 129.2909 19.71672
## Loglikelihood: 166.0688 AIC: -330.1376 BIC: -328.3764
summary(fp) #fitted pareto distribution
## Fitting of the distribution ' pareto ' by maximum likelihood
## Parameters :
## estimate Std. Error
## shape 3769291.94 NaN
## scale 29150.28 4576.787
## Loglikelihood: 166.0688 AIC: -328.1376 BIC: -324.6152
## Correlation matrix:
## shape scale
## shape 1 NaN
## scale NaN 1
summary(fu) #fitted uniform distribution
## Fitting of the distribution ' unif ' by maximum likelihood
## Parameters :
## estimate
## min 0.00000
## max 0.02505
## Loglikelihood: 158.5359 AIC: -313.0718 BIC: -309.5494
summary(fl) #fitted logistic distribution
## Fitting of the distribution ' logis ' by maximum likelihood
## Parameters :
## estimate Std. Error
## location 0.007351559 0.0009194191
## scale 0.003424008 0.0003156140
## Loglikelihood: 158.7641 AIC: -313.5282 BIC: -310.0058
## Correlation matrix:
## location scale
## location 1.00000000 0.03737763
## scale 0.03737763 1.00000000
#Step 8: comparison of the fitted continuous distributions using goodness-of-fit statistics
#remove a fit from the list below if it could not be fitted above
gofstat(list(ft,fn,fg,fe,fp,fu,fl))
## Goodness-of-fit statistics
## 1-mle-triang 2-mle-norm 3-mle-gamma 4-mle-exp
## Kolmogorov-Smirnov statistic 0.1358209 0.13637510 0.1860465 0.1860465
## Cramer-von Mises statistic 0.1374924 0.09027371 0.1716586 0.3203760
## Anderson-Darling statistic 1.1366162 0.66799939 Inf Inf
## 5-mle-pareto 6-mle-unif 7-mle-logis
## Kolmogorov-Smirnov statistic 0.1860465 0.3342153 0.12492624
## Cramer-von Mises statistic 0.3205255 1.9029459 0.08168584
## Anderson-Darling statistic Inf Inf 0.62219988
##
## Goodness-of-fit criteria
## 1-mle-triang 2-mle-norm 3-mle-gamma 4-mle-exp
## Akaike's Information Criterion -317.2100 -314.2445 184683593646 -330.1376
## Bayesian Information Criterion -311.9264 -310.7221 184683593650 -328.3764
## 5-mle-pareto 6-mle-unif 7-mle-logis
## Akaike's Information Criterion -328.1376 -313.0718 -313.5282
## Bayesian Information Criterion -324.6152 -309.5494 -310.0058
#compare GOF and select the best three fitted distributions
#based on GOF and the graphs, both the logistic and normal distributions fit well over the whole range.
#Due to its lower AIC value, the logistic distribution was chosen
#Step 10: based on GOF and the graphs, select the best fitted continuous distribution
# Note the estimated parameters with their standard errors from the summary statistics
# standard errors can be used as uncertainties in a 2D Monte Carlo simulation
#Step 11: Uncertainties in the estimated parameters of the selected distribution can also be estimated by bootstrapping
#So the selected distribution for the fries consumption data is the logistic distribution, with parameter uncertainty as follows:
uncertinity <-bootdist(fl,niter = 1001)
summary(uncertinity)
## Parametric bootstrap medians and 95% percentile CI
## Median 2.5% 97.5%
## location 0.007338434 0.005696604 0.009093599
## scale 0.003385162 0.002596865 0.004306786
plot(uncertinity)
##################
#Exposure Assessment
###########
library(mc2d)
# one-dimensional Monte Carlo simulation
ndvar(100001)
## [1] 100001
concentration_AA_mcgperkg <-mcstoc(rweibull,shape=1.806815,scale=311.503027)
concentration_AA_mcgperkg[concentration_AA_mcgperkg<0] <-0
consumption_fries_kgperkgbwperweek <-mcstoc(rlogis, location= 0.007351559, scale= 0.003424008)
consumption_fries_kgperkgbwperweek[consumption_fries_kgperkgbwperweek<0] <- 0
#mcg/kgbw/day
exposure <-concentration_AA_mcgperkg*consumption_fries_kgperkgbwperweek/7
exposure[exposure<0] <-0
Exp1 <-mc(concentration_AA_mcgperkg,consumption_fries_kgperkgbwperweek, exposure)
print(Exp1)
## node mode nsv nsu nva variate min
## 1 concentration_AA_mcgperkg numeric 100001 1 1 1 0.542
## 2 consumption_fries_kgperkgbwperweek numeric 100001 1 1 1 0.000
## 3 exposure numeric 100001 1 1 1 0.000
## mean median max Nas type outm
## 1 2.77e+02 2.53e+02 1.17e+03 0 V each
## 2 7.73e-03 7.34e-03 5.06e-02 0 V each
## 3 3.05e-01 2.24e-01 3.90e+00 0 V each
summary(Exp1)
## concentration_AA_mcgperkg :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 277 159 0.542 40.6 156 253 373 644 1174 1e+05 0
##
## consumption_fries_kgperkgbwperweek :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.00773 0.0055 0 0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05 0
##
## exposure :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.305 0.306 0 0 0.0824 0.224 0.434 1.1 3.9 1e+05 0
hist(Exp1)
plot(Exp1)
summary(concentration_AA_mcgperkg)
## node :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 277 159 0.542 40.6 156 253 373 644 1174 1e+05 0
summary(consumption_fries_kgperkgbwperweek)
## node :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.00773 0.0055 0 0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05 0
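As a quick sanity check (not part of the original script): for independent inputs, the mean of a product equals the product of the means, so the simulated mean exposure of 0.305 can be cross-checked against a deterministic point estimate built from the analytic Weibull mean and the simulated mean consumption:

```r
# Analytic mean of the fitted Weibull concentration distribution
wb_shape <- 1.806815
wb_scale <- 311.503027
mean_conc <- wb_scale * gamma(1 + 1/wb_shape) # approx. 277 mcg/kg
# Mean consumption after truncation at zero, taken from the summary above
mean_cons <- 0.00773 # kg/kg bw/week
mean_conc * mean_cons / 7 # approx. 0.306 mcg/kg bw/day
```

The point estimate agrees with the simulated mean to within Monte Carlo error, which supports the implementation of the exposure equation.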
# RISK CHARACTERISATION: calculation of the Margin of Exposure (MOE)
# For tumours, EFSA experts selected a BMDL10 of 0.17 mg/kg bw/day;
# for other effects, neurological changes were considered the most
# relevant endpoint, with a BMDL10 of 0.43 mg/kg bw/day.
# The lower (tumour) BMDL10 of 170 mcg/kg bw/day is used here.
# Exposure quantiles from the 5th to the 100th percentile
quantiles<-quantile(exposure, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
quantiles<-as.data.frame(quantiles)
print(quantiles)
## exposure.5. exposure.10. exposure.15. exposure.20. exposure.25.
## NoInc 0 0 0.02917884 0.05598821 0.08238223
## exposure.50. exposure.75. exposure.80. exposure.85. exposure.90.
## NoInc 0.2242505 0.43371 0.4977718 0.5819165 0.6996181
## exposure.95. exposure.97. exposure.100.
## NoInc 0.9007259 1.052152 3.901302
MOE<-170/quantiles
print(MOE)
## exposure.5. exposure.10. exposure.15. exposure.20. exposure.25.
## NoInc Inf Inf 5826.14 3036.354 2063.552
## exposure.50. exposure.75. exposure.80. exposure.85. exposure.90.
## NoInc 758.0809 391.967 341.5219 292.1381 242.9897
## exposure.95. exposure.97. exposure.100.
## NoInc 188.7367 161.5736 43.5752
# An MOE above 10,000 indicates low concern; this requires exposure below 170/10000 = 0.017 mcg/kg bw/day
# Percentage of the population with MOE above 10,000 (low concern):
sum(exposure<0.017)/nrow(exposure)*100
## [1] 13.00587
# An MOE below 1,000 corresponds to exposure above 170/1000 = 0.17
# Percentage of the population with MOE below 1,000 (potential concern):
sum(exposure>0.17)/nrow(exposure)*100
## [1] 59.13141
# An MOE below 100 corresponds to exposure above 170/100 = 1.7
# Percentage of the population with MOE below 100 (high concern):
sum(exposure>1.7)/nrow(exposure)*100
## [1] 0.3549965
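The three threshold calculations above are repeated verbatim for each mitigation scenario later in the script, so they can be wrapped in a small helper function. This is an illustrative refactoring, not part of the original analysis:

```r
# Percentage of the simulated population in each MOE band
# exposure: numeric vector of daily exposures (mcg/kg bw/day)
# bmdl: BMDL10 in the same unit (default 170 mcg/kg bw/day for tumours)
moe_bands <- function(exposure, bmdl = 170) {
  n <- length(exposure)
  c(low_concern = sum(exposure < bmdl / 10000) / n * 100, # MOE > 10,000
    concern = sum(exposure > bmdl / 1000) / n * 100,      # MOE < 1,000
    high_concern = sum(exposure > bmdl / 100) / n * 100)  # MOE < 100
}
# e.g. moe_bands(as.vector(exposure)) reproduces the three percentages above
```

Using one function for all scenarios also guards against the copy-paste slips that repetition invites.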
######
# SCENARIO 1: assume that frying below 175 C reduces the acrylamide (AA) concentration by 10%
# Reuse the same one-dimensional Monte Carlo draws with a 10% lower concentration
concentration_AA_mcgperkg_1 <-concentration_AA_mcgperkg *0.9
# mcg/kg bw/day
exposure_1 <-concentration_AA_mcgperkg_1*consumption_fries_kgperkgbwperweek/7
exposure_1[exposure_1<0] <-0
Exp2 <-mc(concentration_AA_mcgperkg_1,consumption_fries_kgperkgbwperweek, exposure_1)
print(Exp2)
## node mode nsv nsu nva variate min
## 1 concentration_AA_mcgperkg_1 numeric 100001 1 1 1 0.488
## 2 consumption_fries_kgperkgbwperweek numeric 100001 1 1 1 0.000
## 3 exposure_1 numeric 100001 1 1 1 0.000
## mean median max Nas type outm
## 1 2.49e+02 2.28e+02 1.06e+03 0 V each
## 2 7.73e-03 7.34e-03 5.06e-02 0 V each
## 3 2.75e-01 2.02e-01 3.51e+00 0 V each
summary(Exp2)
## concentration_AA_mcgperkg_1 :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 249 143 0.488 36.5 140 228 335 579 1056 1e+05 0
##
## consumption_fries_kgperkgbwperweek :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.00773 0.0055 0 0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05 0
##
## exposure_1 :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.275 0.275 0 0 0.0741 0.202 0.39 0.993 3.51 1e+05 0
hist(Exp2)
plot(Exp2)
summary(concentration_AA_mcgperkg_1)
## node :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 249 143 0.488 36.5 140 228 335 579 1056 1e+05 0
# RISK CHARACTERISATION: MOE calculated with the tumour BMDL10 of 170 mcg/kg bw/day (as for the baseline)
quantiles_1<-quantile(exposure_1, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
quantiles_1<-as.data.frame(quantiles_1)
print(quantiles_1)
## exposure_1.5. exposure_1.10. exposure_1.15. exposure_1.20. exposure_1.25.
## NoInc 0 0 0.02626096 0.05038939 0.07414401
## exposure_1.50. exposure_1.75. exposure_1.80. exposure_1.85.
## NoInc 0.2018254 0.390339 0.4479947 0.5237248
## exposure_1.90. exposure_1.95. exposure_1.97. exposure_1.100.
## NoInc 0.6296563 0.8106533 0.946937 3.511172
MOE_1<-170/quantiles_1
print(MOE_1)
## exposure_1.5. exposure_1.10. exposure_1.15. exposure_1.20. exposure_1.25.
## NoInc Inf Inf 6473.489 3373.726 2292.835
## exposure_1.50. exposure_1.75. exposure_1.80. exposure_1.85.
## NoInc 842.3121 435.5188 379.4688 324.5979
## exposure_1.90. exposure_1.95. exposure_1.97. exposure_1.100.
## NoInc 269.9886 209.7074 179.5262 48.41688
# An MOE above 10,000 indicates low concern; this requires exposure below 170/10000 = 0.017 mcg/kg bw/day
# Percentage of the population with MOE above 10,000 (low concern):
sum(exposure_1<0.017)/nrow(exposure_1)*100
## [1] 13.28587
# An MOE below 1,000 corresponds to exposure above 170/1000 = 0.17
# Percentage of the population with MOE below 1,000 (potential concern):
sum(exposure_1>0.17)/nrow(exposure_1)*100
## [1] 55.83244
# An MOE below 100 corresponds to exposure above 170/100 = 1.7
# Percentage of the population with MOE below 100 (high concern):
sum(exposure_1>1.7)/nrow(exposure_1)*100
## [1] 0.1899981
######
# SCENARIO 2: assume that frying below 175 C in the presence of citric acid reduces the acrylamide (AA) concentration by 90%
# Reuse the same one-dimensional Monte Carlo draws with a 90% lower concentration
concentration_AA_mcgperkg_2 <-concentration_AA_mcgperkg *0.1
# mcg/kg bw/day
exposure_2 <-concentration_AA_mcgperkg_2*consumption_fries_kgperkgbwperweek/7
exposure_2[exposure_2<0] <-0
Exp3 <-mc(concentration_AA_mcgperkg_2,consumption_fries_kgperkgbwperweek, exposure_2)
print(Exp3)
## node mode nsv nsu nva variate min
## 1 concentration_AA_mcgperkg_2 numeric 100001 1 1 1 0.0542
## 2 consumption_fries_kgperkgbwperweek numeric 100001 1 1 1 0.0000
## 3 exposure_2 numeric 100001 1 1 1 0.0000
## mean median max Nas type outm
## 1 27.66696 25.31538 117.3652 0 V each
## 2 0.00773 0.00734 0.0506 0 V each
## 3 0.03052 0.02243 0.3901 0 V each
summary(Exp3)
## concentration_AA_mcgperkg_2 :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 27.7 15.9 0.0542 4.06 15.6 25.3 37.3 64.4 117 1e+05 0
##
## consumption_fries_kgperkgbwperweek :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.00773 0.0055 0 0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05 0
##
## exposure_2 :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 0.0305 0.0306 0 0 0.00824 0.0224 0.0434 0.11 0.39 1e+05 0
hist(Exp3)
plot(Exp3)
summary(concentration_AA_mcgperkg_2)
## node :
## mean sd Min 2.5% 25% 50% 75% 97.5% Max nsv Na's
## NoUnc 27.7 15.9 0.0542 4.06 15.6 25.3 37.3 64.4 117 1e+05 0
# RISK CHARACTERISATION: MOE calculated with the tumour BMDL10 of 170 mcg/kg bw/day (as for the baseline)
quantiles_2<-quantile(exposure_2, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
quantiles_2<-as.data.frame(quantiles_2)
print(quantiles_2)
## exposure_2.5. exposure_2.10. exposure_2.15. exposure_2.20. exposure_2.25.
## NoInc 0 0 0.002917884 0.005598821 0.008238223
## exposure_2.50. exposure_2.75. exposure_2.80. exposure_2.85.
## NoInc 0.02242505 0.043371 0.04977718 0.05819165
## exposure_2.90. exposure_2.95. exposure_2.97. exposure_2.100.
## NoInc 0.06996181 0.09007259 0.1052152 0.3901302
MOE_2<-170/quantiles_2
print(MOE_2)
## exposure_2.5. exposure_2.10. exposure_2.15. exposure_2.20. exposure_2.25.
## NoInc Inf Inf 58261.4 30363.54 20635.52
## exposure_2.50. exposure_2.75. exposure_2.80. exposure_2.85.
## NoInc 7580.809 3919.67 3415.219 2921.381
## exposure_2.90. exposure_2.95. exposure_2.97. exposure_2.100.
## NoInc 2429.897 1887.367 1615.736 435.752
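Because Scenario 2 multiplies the same concentration draws by 0.1, every exposure quantile is exactly one tenth of its baseline value and every MOE quantile exactly ten times larger (compare the median MOE of 7580.809 here with 758.0809 at baseline). A quick illustration with synthetic draws, not the simulation above:

```r
# Quantiles are equivariant under multiplication by a positive constant,
# so scaling exposure by 0.1 scales every quantile by exactly 0.1
set.seed(1)
e <- rweibull(1000, shape = 1.806815, scale = 311.503027)
q_base <- quantile(e, 0.5)
q_scaled <- quantile(e * 0.1, 0.5)
isTRUE(all.equal(unname(q_scaled), unname(q_base) * 0.1)) # TRUE
```

This is why a uniform 90% concentration reduction shifts the whole MOE distribution by one order of magnitude without changing its shape.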
# An MOE above 10,000 indicates low concern; this requires exposure below 170/10000 = 0.017 mcg/kg bw/day
# Percentage of the population with MOE above 10,000 (low concern):
sum(exposure_2<0.017)/nrow(exposure_2)*100
## [1] 40.86859
# An MOE below 1,000 corresponds to exposure above 170/1000 = 0.17
# Percentage of the population with MOE below 1,000 (potential concern):
sum(exposure_2>0.17)/nrow(exposure_2)*100
## [1] 0.3549965
# An MOE below 100 corresponds to exposure above 170/100 = 1.7
# Percentage of the population with MOE below 100 (high concern):
sum(exposure_2>1.7)/nrow(exposure_2)*100
## [1] 0
# summary
# Create a comprehensive comparison table
comparison_table <- data.frame(
Scenario = c("Baseline", "Scenario 1 (10% reduction)", "Scenario 2 (90% reduction)"),
Median_Exposure = c(
median(exposure),
median(exposure_1),
median(exposure_2)
),
# as.numeric() drops the names and class of the quantile output so that
# each scenario contributes a single value rather than a matrix column
P95_Exposure = c(
as.numeric(quantile(exposure, 0.95)),
as.numeric(quantile(exposure_1, 0.95)),
as.numeric(quantile(exposure_2, 0.95))
),
Percent_Safe = c(
sum(exposure < 0.017) / nrow(exposure) * 100,
sum(exposure_1 < 0.017) / nrow(exposure_1) * 100,
sum(exposure_2 < 0.017) / nrow(exposure_2) * 100
),
Percent_Potential_Concern = c(
sum(exposure > 0.17) / nrow(exposure) * 100,
sum(exposure_1 > 0.17) / nrow(exposure_1) * 100,
sum(exposure_2 > 0.17) / nrow(exposure_2) * 100
),
Percent_High_Concern = c(
sum(exposure > 1.7) / nrow(exposure) * 100,
sum(exposure_1 > 1.7) / nrow(exposure_1) * 100,
sum(exposure_2 > 1.7) / nrow(exposure_2) * 100
)
)
print("Comprehensive Scenario Comparison:")
## [1] "Comprehensive Scenario Comparison:"
print(comparison_table)
##                     Scenario Median_Exposure P95_Exposure Percent_Safe
## 1                   Baseline      0.22425048   0.90072590     13.00587
## 2 Scenario 1 (10% reduction)      0.20182543   0.81065330     13.28587
## 3 Scenario 2 (90% reduction)      0.02242505   0.09007259     40.86859
##   Percent_Potential_Concern Percent_High_Concern
## 1                59.1314087            0.3549965
## 2                55.8324417            0.1899981
## 3                 0.3549965            0.0000000
library(ggplot2)
library(gridExtra)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:gridExtra':
##
## combine
## The following object is masked from 'package:MASS':
##
## select
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
# Prepare data for plotting
exposure_data <- data.frame(
Exposure = c(exposure, exposure_1, exposure_2),
Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"),
each = length(exposure))
)
# Create density plot
p1 <- ggplot(exposure_data, aes(x = Exposure, fill = Scenario)) +
geom_density(alpha = 0.6) +
scale_x_continuous(trans = 'log10', limits = c(1e-4, 10)) +
geom_vline(xintercept = 0.017, linetype = "dashed", color = "green", linewidth = 1) +
geom_vline(xintercept = 0.17, linetype = "dashed", color = "orange", linewidth = 1) +
geom_vline(xintercept = 1.7, linetype = "dashed", color = "red", linewidth = 1) +
labs(title = "Exposure Distribution Across Scenarios",
x = "Daily Exposure (μg/kg bw/day, log scale)",
y = "Density") +
theme_minimal() +
annotate("text", x = 0.01, y = 1, label = "MOE > 10,000", color = "green", size = 3) +
annotate("text", x = 0.1, y = 1, label = "MOE < 1,000", color = "orange", size = 3) +
annotate("text", x = 2, y = 1, label = "MOE < 100", color = "red", size = 3)
print(p1)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_density()`).
# Calculate MOE for all scenarios
MOE_baseline <- 170 / exposure
MOE_scenario1 <- 170 / exposure_1
MOE_scenario2 <- 170 / exposure_2
moe_data <- data.frame(
MOE = c(MOE_baseline, MOE_scenario1, MOE_scenario2),
Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"),
each = length(MOE_baseline))
)
# Remove infinite MOE values (individuals with zero exposure after truncation) before plotting
moe_data <- moe_data[is.finite(moe_data$MOE), ]
p2 <- ggplot(moe_data, aes(x = Scenario, y = MOE, fill = Scenario)) +
geom_boxplot(alpha = 0.7) +
scale_y_continuous(trans = 'log10', limits = c(10, 100000)) +
geom_hline(yintercept = 100, linetype = "dashed", color = "red", linewidth = 1) +
geom_hline(yintercept = 1000, linetype = "dashed", color = "orange", linewidth = 1) +
geom_hline(yintercept = 10000, linetype = "dashed", color = "green", linewidth = 1) +
labs(title = "Margin of Exposure (MOE) Comparison",
y = "MOE (log scale)") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
annotate("text", x = 1.5, y = 50, label = "MOE < 100", color = "red", size = 3) +
annotate("text", x = 1.5, y = 500, label = "MOE < 1,000", color = "orange", size = 3) +
annotate("text", x = 1.5, y = 5000, label = "MOE > 10,000", color = "green", size = 3)
print(p2)
## Warning: Removed 2906 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
# Calculate risk categories
risk_categories <- data.frame(
Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"), each = 3),
Risk_Level = rep(c("Safe (MOE > 10,000)", "Potential Concern (MOE < 1,000)", "High Concern (MOE < 100)"), 3),
Percentage = c(
sum(exposure < 0.017) / nrow(exposure) * 100,
sum(exposure > 0.17) / nrow(exposure) * 100,
sum(exposure > 1.7) / nrow(exposure) * 100,
sum(exposure_1 < 0.017) / nrow(exposure_1) * 100,
sum(exposure_1 > 0.17) / nrow(exposure_1) * 100,
sum(exposure_1 > 1.7) / nrow(exposure_1) * 100,
sum(exposure_2 < 0.017) / nrow(exposure_2) * 100,
sum(exposure_2 > 0.17) / nrow(exposure_2) * 100,
sum(exposure_2 > 1.7) / nrow(exposure_2) * 100
)
)
p3 <- ggplot(risk_categories, aes(x = Scenario, y = Percentage, fill = Risk_Level)) +
geom_bar(stat = "identity", position = "dodge") +
labs(title = "Population Distribution Across Risk Categories",
y = "Percentage of Population (%)",
fill = "Risk Category") +
theme_minimal() +
scale_fill_manual(values = c("Safe (MOE > 10,000)" = "green",
"Potential Concern (MOE < 1,000)" = "orange",
"High Concern (MOE < 100)" = "red")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p3)
# Create cumulative distribution plot
p4 <- ggplot(exposure_data, aes(x = Exposure, color = Scenario)) +
stat_ecdf(linewidth = 1) +
scale_x_continuous(trans = 'log10', limits = c(1e-4, 10)) +
geom_vline(xintercept = 0.017, linetype = "dashed", color = "green", linewidth = 0.5) +
geom_vline(xintercept = 0.17, linetype = "dashed", color = "orange", linewidth = 0.5) +
geom_vline(xintercept = 1.7, linetype = "dashed", color = "red", linewidth = 0.5) +
labs(title = "Cumulative Distribution of Exposure",
x = "Daily Exposure (μg/kg bw/day, log scale)",
y = "Cumulative Probability") +
theme_minimal()
print(p4)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_ecdf()`).
# Create a comprehensive dashboard
library(patchwork)
##
## Attaching package: 'patchwork'
## The following object is masked from 'package:MASS':
##
## area
# Arrange all plots
combined_plot <- (p1 | p2) / (p3 | p4) +
plot_annotation(title = "Acrylamide Risk Assessment: Scenario Comparison Dashboard",
theme = theme(plot.title = element_text(hjust = 0.5, size = 16)))
print(combined_plot)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_density()`).
## Warning: Removed 2906 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_ecdf()`).
# Print summary table
print("Summary Comparison Table:")
## [1] "Summary Comparison Table:"
print(comparison_table)
##                     Scenario Median_Exposure P95_Exposure Percent_Safe
## 1                   Baseline      0.22425048   0.90072590     13.00587
## 2 Scenario 1 (10% reduction)      0.20182543   0.81065330     13.28587
## 3 Scenario 2 (90% reduction)      0.02242505   0.09007259     40.86859
##   Percent_Potential_Concern Percent_High_Concern
## 1                59.1314087            0.3549965
## 2                55.8324417            0.1899981
## 3                 0.3549965            0.0000000