Tuesday, October 28, 2025

A Probabilistic Risk Assessment of Acrylamide Exposure from French Fries: A Case Study on Mitigation Scenarios

Author: Kshitij Shrestha

Abstract

Background: Acrylamide, a processing contaminant formed in starchy foods during high-temperature cooking, is classified as a probable human carcinogen. This study conducts a quantitative risk assessment of acrylamide exposure specifically from French fry consumption.

Methods: A one-dimensional Monte Carlo simulation (n = 100,001) was implemented in R. Acrylamide concentration (µg/kg) was modeled using a Weibull distribution (shape = 1.81, scale = 311.50), and French fry consumption (kg/kg bw/week) was modeled using a Logistic distribution (location = 0.00735, scale = 0.00342). Daily exposure (µg/kg bw/day) was calculated. Risk was characterized using the Margin of Exposure (MOE) approach with a BMDL₁₀ of 0.17 mg/kg bw/day (170 µg/kg bw/day) for tumorigenic effects. Two mitigation scenarios were evaluated: a 10% reduction (Scenario 1) and a 90% reduction (Scenario 2) in acrylamide concentration.

Results: The baseline exposure estimate showed a median of 0.224 µg/kg bw/day. The MOE analysis revealed that 13.01% of the population had an MOE > 10,000 (low concern), while 59.13% had an MOE < 1,000, indicating a potential public health concern. Scenario 1 showed minimal improvement, with 13.28% of the population reaching MOE > 10,000 and 55.83% remaining with MOE < 1,000. Scenario 2 demonstrated substantial risk reduction, with 40.87% of the population achieving MOE > 10,000 and 0.35% having MOE < 1,000.

Conclusion: Current acrylamide exposure from French fries poses a potential health concern for more than half of the population. While a 10% reduction in acrylamide levels is largely ineffective, a 90% reduction nearly eliminates exposures of potential concern (MOE < 1,000 falls from 59.13% to 0.35% of the population), highlighting the importance of effective mitigation strategies such as citric acid treatment during processing.

1. Introduction

Acrylamide formation in thermally processed foods, particularly potato products like French fries, represents a significant food safety challenge. Since its discovery in 2002, extensive research has confirmed acrylamide's neurotoxic and carcinogenic properties. The Margin of Exposure (MOE) approach, endorsed by the European Food Safety Authority (EFSA) and other regulatory bodies, provides a framework for assessing the risk of such genotoxic carcinogens.

This case study employs probabilistic modeling to quantify acrylamide exposure from French fries and evaluate the potential effectiveness of two mitigation strategies. Unlike deterministic methods, this approach captures population variability in both consumption patterns and contaminant concentrations, providing a more realistic risk characterization.
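For contrast, a purely deterministic calculation would collapse both inputs to single point values. The following one-line illustration is not part of the original analysis; it uses the sample means reported in the attached script output:

# Deterministic point estimate using the sample means from the data summaries below
mean_conc <- 276.1      # µg/kg, mean acrylamide concentration
mean_cons <- 0.007734   # kg/kg bw/week, mean French fry consumption
mean_conc * mean_cons / 7   # ~0.305 µg/kg bw/day: one number, no population tails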

2. Materials and Methods

2.1. Exposure Modeling

A one-dimensional Monte Carlo simulation was conducted using the mc2d package in R with 100,001 iterations. The model incorporated two key stochastic variables:

  • Acrylamide concentration (µg/kg): Weibull distribution (shape = 1.806815, scale = 311.503027)

  • French fry consumption (kg/kg bw/week): Logistic distribution (location = 0.007351559, scale = 0.003424008)

These distributions were selected after fitting a range of continuous distributions to the available concentration and consumption data and comparing goodness-of-fit statistics across the candidates, as sketched below.
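A minimal sketch of this selection step, assuming the concentration observations sit in a numeric vector conc (the full fitting workflow, covering ten candidate distributions, is in the attached script):

library(fitdistrplus)

# Fit a few candidate distributions to the concentration data
fits <- list(
  weibull = fitdist(conc, "weibull"),
  gamma   = fitdist(conc, "gamma"),
  lnorm   = fitdist(conc, "lnorm")
)
gofstat(fits)  # compare AIC/BIC and the KS, Cramer-von Mises and Anderson-Darling statistics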

Daily exposure was calculated as:
Exposure = (Concentration × Consumption) / 7

Negative values generated by the logistic consumption distribution were truncated to zero to maintain biological plausibility.
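A minimal sketch of the exposure simulation, using the parameter estimates above (the seed is illustrative and not part of the original script):

library(mc2d)

set.seed(123)   # illustrative seed for reproducibility
ndvar(100001)   # number of iterations in the variability dimension

concentration <- mcstoc(rweibull, shape = 1.806815, scale = 311.503027)       # µg/kg
consumption   <- mcstoc(rlogis, location = 0.007351559, scale = 0.003424008)  # kg/kg bw/week
consumption[consumption < 0] <- 0   # truncate negative logistic draws to zero

exposure <- concentration * consumption / 7   # µg/kg bw/day
summary(mc(concentration, consumption, exposure))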

2.2. Risk Characterization

Risk characterization employed the Margin of Exposure (MOE) approach:
MOE = BMDL₁₀ / Estimated Daily Exposure
where BMDL₁₀ = 170 µg/kg bw/day (0.17 mg/kg bw/day) for tumorigenic effects.

Population risk was categorized as follows (a short classification sketch appears after the list):

  • MOE > 10,000: Low health concern

  • MOE < 1,000: Potential health concern

  • MOE < 100: High health concern
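Because MOE = BMDL₁₀ / exposure, each MOE threshold corresponds to a fixed exposure cut-off. A minimal sketch of the classification, assuming the exposure node from the sketch in Section 2.1:

bmdl10 <- 170   # µg/kg bw/day

# Share of the simulated population in each band (cut-offs: 170/10000, 170/1000, 170/100)
sum(exposure < bmdl10 / 10000) / nrow(exposure) * 100   # % with MOE > 10,000 (low concern)
sum(exposure > bmdl10 / 1000)  / nrow(exposure) * 100   # % with MOE < 1,000 (potential concern)
sum(exposure > bmdl10 / 100)   / nrow(exposure) * 100   # % with MOE < 100 (high concern)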

2.3. Mitigation Scenarios

Two intervention scenarios were modeled (sketched in code after the list):

  • Scenario 1: 10% reduction in acrylamide concentration (heating below 175 °C)

  • Scenario 2: 90% reduction in acrylamide concentration (heating below 175 °C with citric acid pretreatment)
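In the simulation, each scenario is simply a multiplicative factor applied to the baseline concentration node. A minimal sketch, reusing the nodes from the Section 2.1 sketch:

# Apply each mitigation as a fractional reduction of the baseline concentration
concentration_s1 <- concentration * 0.9   # Scenario 1: 10% reduction
concentration_s2 <- concentration * 0.1   # Scenario 2: 90% reduction

exposure_s1 <- concentration_s1 * consumption / 7   # µg/kg bw/day
exposure_s2 <- concentration_s2 * consumption / 7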

2.4. Software

All analyses and simulations were performed in R (version 4.3.1; R Foundation for Statistical Computing). The full R script and its output are attached below.

3. Results

3.1. Baseline Exposure Assessment

The baseline model results are summarized in Table 1. The median acrylamide exposure was 0.224 µg/kg bw/day, and the distribution was markedly right-skewed (97.5th percentile = 1.1 µg/kg bw/day).

Table 1. Baseline Exposure and MOE Distribution

Percentile    Exposure (µg/kg bw/day)    MOE
25%           0.082                      2073
50%           0.224                      759
75%           0.434                      392
97.5%         1.1                        155

3.2. Risk Characterization

The population risk assessment revealed the following:

  • 13.01% of the population had MOE > 10,000

  • 59.13% had MOE < 1,000 (potential concern)

  • 0.35% had MOE < 100 (high concern)

3.3. Effectiveness of Mitigation Strategies

Table 2. Impact of Mitigation Scenarios on Population Risk

Scenario         % Population MOE > 10,000    % Population MOE < 1,000    % Population MOE < 100
Baseline         13.01%                       59.13%                      0.35%
10% Reduction    13.28%                       55.83%                      0.19%
90% Reduction    40.87%                       0.35%                       0.00%

Scenario 1 (10% reduction) showed negligible improvement in population risk. In contrast, Scenario 2 (90% reduction) demonstrated substantial benefit, moving about 41% of the population into the "low concern" category (MOE > 10,000, up from 13% at baseline) and almost eliminating exposures with MOE values below 1,000 (0.35%, down from 59.13%).


4. Discussion

The results clearly demonstrate that current exposure levels to acrylamide from French fries present a potential public health concern, with more than half of the population having MOE values below the threshold of 1,000. This finding aligns with concerns raised by regulatory agencies worldwide regarding dietary acrylamide exposure.

The differential effectiveness of the two mitigation scenarios provides crucial insights for risk management. The minimal impact of Scenario 1 suggests that small reductions in acrylamide concentration are insufficient to meaningfully alter population risk. This has important implications for the food industry, indicating that incremental process adjustments may not achieve desired public health outcomes.

Conversely, the dramatic improvement seen in Scenario 2 underscores the potential of effective mitigation technologies. The 90% reduction scenario, potentially achievable through methods such as citric acid pretreatment or enzyme treatment, would move roughly 41% of the population into the low-concern category and reduce the fraction with MOE below 1,000 from 59% to under 1%. This represents a significant public health achievement and highlights the importance of developing and implementing robust mitigation strategies.

Limitations: This assessment focused exclusively on French fries as an exposure source, while total dietary acrylamide exposure includes contributions from multiple food categories. Additionally, the model assumed uniform application and effectiveness of mitigation strategies, which may not reflect real-world conditions.

5. Conclusion

This probabilistic risk assessment confirms that acrylamide exposure from French fries represents a meaningful public health concern under current conditions. The analysis demonstrates that while modest reductions in acrylamide levels provide negligible benefit, substantial reductions (90%) can effectively protect a significant portion of the population. These findings support the need for targeted research and development of effective mitigation technologies and provide a quantitative basis for regulatory decision-making regarding acrylamide levels in processed potato products.


References

  1. Tareke, E., et al. (2002). Analysis of acrylamide, a carcinogen formed in heated foodstuffs. Journal of Agricultural and Food Chemistry, 50(17), 4998-5006.

  2. EFSA Panel on Contaminants in the Food Chain (CONTAM). (2015). Scientific opinion on acrylamide in food. EFSA Journal, 13(6), 4104.

  3. Pedreschi, F., et al. (2008). Acrylamide reduction in potato chips by using citric acid. Journal of Food Science, 73(5), E197-E201.


Acrylamide Case Study

When you click the Knit button, a document is generated that includes both the narrative content and the output of any embedded R code chunks within the document.

# please save this R script/markdown file and the concentration and consumption data files in the same folder
library(rstudioapi)

# Set the working directory to the directory containing the script
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))


#Step 1: Import the concentration and consumption datasets
library(readxl)

Acrylamide_concentration_data <- read_excel("Acrylamide concentration data.xlsx")
fries_consumption_data <- read_excel("fries consumption data.xlsx")

#####################

#Fitting the Acrylamide concentration data

######################

#Step 2: Assign the data to be fitted to "Concentrationdata"
Concentrationdata <-Acrylamide_concentration_data$`AA concentrations mcgperkg`
summary(Concentrationdata)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    56.0   140.0   254.0   276.1   356.0   729.0
#Step 3: Load the relevant libraries

library("fitdistrplus")
## Loading required package: MASS
## Loading required package: survival
library("actuar")
## 
## Attaching package: 'actuar'
## The following objects are masked from 'package:stats':
## 
##     sd, var
## The following object is masked from 'package:grDevices':
## 
##     cm
library("stats")
library("MASS")
library("survival")
library("triangle")
library("mc2d")
## Loading required package: mvtnorm
## 
## Attaching package: 'mc2d'
## The following objects are masked from 'package:base':
## 
##     pmax, pmin
#Always maximize the plot window before running the program
#If the window is too small, the graphs won't fit and will give an error

#A. FITTING CONTINUOUS DISTRIBUTIONS covers steps 4 to 10
#B. FITTING DISCRETE DISTRIBUTIONS starts at STEP 11

# LET'S START WITH CONTINUOUS DISTRIBUTION FITTING
# A. FITTING CONTINUOUS DISTRIBUTIONS

#Step 4: # skewness-kurtosis plot of the data
par(mfrow=c(1,1)) # make 1 graph in one page
descdist(Concentrationdata, boot = 1000)

## summary statistics
## ------
## min:  56   max:  729 
## median:  254 
## mean:  276.0909 
## estimated sd:  163.1005 
## estimated skewness:  0.7936859 
## estimated kurtosis:  3.377937
plotdist(Concentrationdata)

#Step 5:# FITTING DISTRIBUTION
# fit one distribution at a time and check graphs and fitting (summary statistics are done at later stage)

#1. Fitting Triangle distribution
# the minimum, maximum and mode value of the observations were min=56, max=729, mode=98
ft=fitdist(Concentrationdata, "triang", start = list(min=56, max=729, mode=98))
## Warning in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, : Some
## parameter names have no starting/fixed value but have a default value: mean.
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in sqrt(diag(varcovar)): NaNs produced
# IF error, please check whether there is infinity in the actual log-likelihoods
do.call(dtriang,c(list(x=Concentrationdata,log=TRUE),list(min=56, max=729, mode=98)))
##  [1]      -Inf -7.610358 -7.071361 -5.818598 -5.818598 -5.832964 -5.837799
##  [8] -5.875663 -5.887478 -5.909796 -5.948739 -6.031493 -6.073542 -6.075589
## [15] -6.085887 -6.098388 -6.102589 -6.111046 -6.126019 -6.130339 -6.197478
## [22] -6.202119 -6.230423 -6.235219 -6.344326 -6.532563 -6.568811 -6.682408
## [29] -6.686174 -6.886007 -6.923570 -7.095420      -Inf
# change the starting condition based on summary statistics
plot(ft) # histogram, QQ, CDF and PP plot at once

#2. fitting normal distribution
fn <-fitdist(Concentrationdata,"norm")
plot(fn) # histogram, QQ, CDF and PP plot at once

#3. Fitting weibull distribution
fw <- fitdist(Concentrationdata, "weibull")
plot(fw) # histogram, QQ, CDF and PP plot at once

#4. Fitting lognormal distribution
fln <- fitdist(Concentrationdata, "lnorm")
plot(fln) # histogram, QQ, CDF and PP plot at once

#5. Fitting gamma distribution
fg <- fitdist(Concentrationdata, "gamma")
plot(fg) # histogram, QQ, CDF and PP plot at once

#6. Fitting Loglogistic distribution
fll <- fitdist(Concentrationdata, "llogis")
plot(fll) # histogram, QQ, CDF and PP plot at once

#7. Fitting exponential distribution
fe <-fitdist(Concentrationdata,"exp")
plot(fe) # histogram, QQ, CDF and PP plot at once

#8. Fitting pareto distribution
fp <- fitdist(Concentrationdata, "pareto", start = list(shape = 1, scale = 500))
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
plot(fp) # histogram, QQ, CDF and PP plot at once

#9. Fitting uniform distribution 
fu <-fitdist(Concentrationdata,"unif")
plot(fu) # histogram, QQ, CDF and PP plot at once

#10. Fitting logistic distribution
fl <-fitdist(Concentrationdata,"logis")
plot(fl) # histogram, QQ, CDF and PP plot at once

#STEP 6: GRAPHICAL COMPARISON OF THE DIFFERENT FITTED CONTINUOUS DISTRIBUTIONS
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#A. Comparison of density plots of the different fitted distributions
denscomp(ft, legendtext = "triangle")
denscomp(fn, legendtext = "normal")
denscomp(fw, legendtext = "weibull")
denscomp(fln, legendtext = "lognormal")

denscomp(fg, legendtext = "gamma")
denscomp(fll, legendtext = "loglogistic") 
denscomp(fe, legendtext = "exponential") 
denscomp(fp, legendtext = "pareto") 

denscomp(fu, legendtext = "uniform") 
denscomp(fl, legendtext = "logistic") 

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#B. Comparison of CDF plots of the different fitted distributions
cdfcomp(ft, legendtext = "triangle") # Cumulative Distribution plot
cdfcomp(fn, legendtext = "normal") # Cumulative Distribution plot
cdfcomp(fw, legendtext = "weibull") # Cumulative Distribution plot
cdfcomp(fln, legendtext = "lognormal") # Cumulative Distribution plot

cdfcomp(fg, legendtext = "gamma") # Cumulative Distribution plot
cdfcomp(fll, legendtext = "loglogistic") # Cumulative Distribution plot
cdfcomp(fe, legendtext = "exponential") # Cumulative Distribution plot
cdfcomp(fp, legendtext = "pareto") # Cumulative Distribution plot

cdfcomp(fu, legendtext = "uniform") # Cumulative Distribution plot
cdfcomp(fl, legendtext = "logistic") # Cumulative Distribution plot

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#C. Comparison of QQ plots of the different fitted distributions

qqcomp(ft, legendtext = "triangle") #QQplot
qqcomp(fn, legendtext = "normal")
qqcomp(fw, legendtext = "weibull")
qqcomp(fln, legendtext = "lognormal")

qqcomp(fg, legendtext = "gamma")
qqcomp(fll, legendtext = "loglogistic") 
qqcomp(fe, legendtext = "exponential") 
qqcomp(fp, legendtext = "pareto") 

qqcomp(fu, legendtext = "uniform") 
qqcomp(fl, legendtext = "logistic") 

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#D. Comparison of PP plots of the different fitted distributions
ppcomp(ft, legendtext = "triangle") #PP plot
ppcomp(fn, legendtext = "normal")
ppcomp(fw, legendtext = "weibull")
ppcomp(fln, legendtext = "lognormal")

ppcomp(fg, legendtext = "gamma")
ppcomp(fll, legendtext = "loglogistic") 
ppcomp(fe, legendtext = "exponential") 
ppcomp(fp, legendtext = "pareto") 

ppcomp(fu, legendtext = "uniform") 
ppcomp(fl, legendtext = "logistic") 

par(mfrow=c(1,1)) # make graph 1 by 1 in one page (1 in one page)

#Step 7: comparison of the summary statistics of the different fits
# comparison of goodness of fit via log-likelihood, AIC and BIC values
# choose the distribution with low AIC and BIC values (best fit)
summary(ft) #fitted triangle distribution
## Fitting of the distribution ' triang ' by maximum likelihood 
## Parameters : 
##      estimate Std. Error
## min     -16.9   37.24010
## max     777.6   47.02371
## mode    146.6        NaN
## Loglikelihood:  -212.169   AIC:  430.338   BIC:  434.8275 
## Correlation matrix:
##            min       max mode
## min   1.000000 -0.175488  NaN
## max  -0.175488  1.000000  NaN
## mode       NaN       NaN    1
summary(fn) #fitted normal distribution
## Fitting of the distribution ' norm ' by maximum likelihood 
## Parameters : 
##      estimate Std. Error
## mean 276.0909   27.95878
## sd   160.6103   19.76976
## Loglikelihood:  -214.4313   AIC:  432.8627   BIC:  435.8557 
## Correlation matrix:
##              mean           sd
## mean 1.000000e+00 2.025087e-06
## sd   2.025087e-06 1.000000e+00
summary(fw) #fitted weibull distribution
## Fitting of the distribution ' weibull ' by maximum likelihood 
## Parameters : 
##         estimate Std. Error
## shape   1.806815  0.2445039
## scale 311.503027 31.6813979
## Loglikelihood:  -211.2729   AIC:  426.5458   BIC:  429.5388 
## Correlation matrix:
##         shape   scale
## shape 1.00000 0.32029
## scale 0.32029 1.00000
summary(fln) #fitted lognormal distribution
## Fitting of the distribution ' lnorm ' by maximum likelihood 
## Parameters : 
##         estimate Std. Error
## meanlog 5.428584 0.11446423
## sdlog   0.657547 0.08093759
## Loglikelihood:  -212.1333   AIC:  428.2667   BIC:  431.2597 
## Correlation matrix:
##         meanlog sdlog
## meanlog       1     0
## sdlog         0     1
summary(fg) #fitted gamma distribution
## Fitting of the distribution ' gamma ' by maximum likelihood 
## Parameters : 
##          estimate  Std. Error
## shape 2.756572945 0.621322135
## rate  0.009984886 0.002443173
## Loglikelihood:  -211.2098   AIC:  426.4196   BIC:  429.4126 
## Correlation matrix:
##           shape      rate
## shape 1.0000000 0.9055933
## rate  0.9055933 1.0000000
summary(fll) #fitted loglogistic distribution
## Fitting of the distribution ' llogis ' by maximum likelihood 
## Parameters : 
##        estimate Std. Error
## shape   2.61607  0.3775947
## scale 238.12784 27.8446554
## Loglikelihood:  -212.8607   AIC:  429.7214   BIC:  432.7144 
## Correlation matrix:
##            shape      scale
## shape 1.00000000 0.06713573
## scale 0.06713573 1.00000000
summary(fe) #fitted exponential distribution
## Fitting of the distribution ' exp ' by maximum likelihood 
## Parameters : 
##         estimate   Std. Error
## rate 0.003621995 0.0005772947
## Loglikelihood:  -218.4841   AIC:  438.9682   BIC:  440.4647
summary(fp) #fitted pareto distribution
## Fitting of the distribution ' pareto ' by maximum likelihood 
## Parameters : 
##         estimate Std. Error
## shape    4695538          0
## scale 1296381246          0
## Loglikelihood:  -218.4841   AIC:  440.9682   BIC:  443.9612 
## Correlation matrix:
##       shape scale
## shape     1   NaN
## scale   NaN     1
summary(fu) #fitted uniform distribution
## Fitting of the distribution ' unif ' by maximum likelihood 
## Parameters : 
##     estimate
## min       56
## max      729
## Loglikelihood:  -214.8876   AIC:  433.7752   BIC:  436.7682
summary(fl) #fitted logistic distribution
## Fitting of the distribution ' logis ' by maximum likelihood 
## Parameters : 
##           estimate Std. Error
## location 262.92350   27.76334
## scale     91.21987   13.20561
## Loglikelihood:  -214.6138   AIC:  433.2276   BIC:  436.2207 
## Correlation matrix:
##            location      scale
## location 1.00000000 0.06409305
## scale    0.06409305 1.00000000
#Step 8: comparison of the fitted continuous distributions using goodness-of-fit statistics
#please remove a fit from the list below if it could not be fitted above
gofstat(list(ft,fn,fw,fln,fg,fll,fe,fp,fu,fl))
## Goodness-of-fit statistics
##                              1-mle-triang 2-mle-norm 3-mle-weibull 4-mle-lnorm
## Kolmogorov-Smirnov statistic    0.1578342  0.1363943    0.10071430   0.1679134
## Cramer-von Mises statistic      0.1326371  0.1041954    0.05314437   0.1058692
## Anderson-Darling statistic      0.7412460  0.6393422    0.31540856   0.5822591
##                              5-mle-gamma 6-mle-llogis 7-mle-exp 8-mle-pareto
## Kolmogorov-Smirnov statistic   0.1272613   0.14148524 0.2171099    0.2171138
## Cramer-von Mises statistic     0.0619509   0.08449744 0.4622041    0.4622202
## Anderson-Darling statistic     0.3562905   0.55047346 2.5811868    2.5812503
##                              9-mle-unif 10-mle-logis
## Kolmogorov-Smirnov statistic  0.3454005   0.09489029
## Cramer-von Mises statistic    1.2633074   0.07235833
## Anderson-Darling statistic          Inf   0.55695712
## 
## Goodness-of-fit criteria
##                                1-mle-triang 2-mle-norm 3-mle-weibull
## Akaike's Information Criterion     430.3380   432.8627      426.5458
## Bayesian Information Criterion     434.8275   435.8557      429.5388
##                                4-mle-lnorm 5-mle-gamma 6-mle-llogis 7-mle-exp
## Akaike's Information Criterion    428.2667    426.4196     429.7214  438.9682
## Bayesian Information Criterion    431.2597    429.4126     432.7144  440.4647
##                                8-mle-pareto 9-mle-unif 10-mle-logis
## Akaike's Information Criterion     440.9682   433.7752     433.2276
## Bayesian Information Criterion     443.9612   436.7682     436.2207
#compare GOF and select the best fitted distributions

#based on GOF and the plots, both the gamma and Weibull distributions fit well over the whole range.
#For our example, the Weibull distribution was chosen

#Step 10: based on GOF and the plots, select the best fitted continuous distribution
# Note the estimated parameters with their standard errors from the summary statistics
# standard errors can be used as uncertainties in a 2D Monte Carlo simulation

#Step 11: Uncertainties in the estimated parameters of the selected distribution can also be estimated by bootstrapping


#So the selected distribution for the acrylamide concentration data is the Weibull distribution, with parameter uncertainty as follows:
uncertainty <-bootdist(fw,niter = 1001)
summary(uncertainty)
## Parametric bootstrap medians and 95% percentile CI 
##           Median       2.5%      97.5%
## shape   1.858237   1.458437   2.513762
## scale 310.359322 251.745487 371.033455
plot(uncertainty)

#######################

#Fitting the distribution of the consumption data

##################



#Step 2: Assign the data to be fitted to "Consumptiondata"
Consumptiondata <-fries_consumption_data$`weekly consumption of fried French fries  kgperkgbwperweek`
summary(Consumptiondata)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## 0.000000 0.003750 0.006933 0.007734 0.011767 0.025050
#Step 3: Load the relevant libraries

library("fitdistrplus")
library("actuar")
library("stats")
library("MASS")
library("survival")
library("triangle")
library("mc2d")

#Always maximize the plot window before running the program
#If the window is too small, the graphs won't fit and will give an error

#A. FITTING CONTINUOUS DISTRIBUTIONS covers steps 4 to 10
#B. FITTING DISCRETE DISTRIBUTIONS starts at STEP 11

# LET'S START WITH CONTINUOUS DISTRIBUTION FITTING
# A. FITTING CONTINUOUS DISTRIBUTIONS

#Step 4: # skewness-kurtosis plot of the data
par(mfrow=c(1,1)) # make 1 graph in one page
descdist(Consumptiondata, boot = 1000)

## summary statistics
## ------
## min:  0   max:  0.02505 
## median:  0.006933333 
## mean:  0.007734496 
## estimated sd:  0.006050041 
## estimated skewness:  0.7031494 
## estimated kurtosis:  3.32809
plotdist(Consumptiondata)

#Step 5:# FITTING DISTRIBUTION
# fit one distribution at a time and check graphs and fitting (summary statistics are done at later stage)

#1. Fitting triangle distribution
ft=fitdist(Consumptiondata, "triang", start = list(min=0, max=0.02505, mode=0.0001))
## Warning in checkparamlist(arg_startfix$start.arg, arg_startfix$fix.arg, : Some
## parameter names have no starting/fixed value but have a default value: mean.
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in sqrt(diag(varcovar)): NaNs produced
# IF error, please check whether there is infinity in the actual log-likelihoods
do.call(dtriang,c(list(x=Consumptiondata,log=TRUE),list(min=0, max=0.02505, mode=0.0001)))
##  [1]     -Inf     -Inf     -Inf     -Inf     -Inf     -Inf     -Inf     -Inf
##  [9] 4.241235 4.241235 4.241235 4.202106 4.202106 4.202106 4.181952 4.181952
## [17] 4.181952 4.161384 4.160552 4.157219 4.063660 4.059987 3.989482 3.979561
## [25] 3.969541 3.965505 3.954320 3.943009 3.885546 3.817570 3.769541 3.757165
## [33] 3.742108 3.699494 3.688883 3.684875 3.624167 3.487175 3.469068 3.283418
## [41] 3.238421 2.590271     -Inf
# change the starting condition based on summary statistics
plot(ft) # histogram, QQ, CDF and PP plot at once

#2. fitting normal distribution
fn <-fitdist(Consumptiondata,"norm")
plot(fn) # histogram, QQ, CDF and PP plot at once

#3. Fitting gamma distribution
fg <- fitdist(Consumptiondata, "gamma")
plot(fg) # histogram, QQ, CDF and PP plot at once

#4. Fitting exponential distribution
fe <-fitdist(Consumptiondata,"exp")
plot(fe) # histogram, QQ, CDF and PP plot at once

#5. Fitting pareto distribution
fp <- fitdist(Consumptiondata, "pareto", start = list(shape = 1, scale = 500))
## Warning in cov2cor(varcovar): diag(V) had non-positive or NA entries; the
## non-finite result may be dubious
## Warning in cov2cor(varcovar): NaNs produced
plot(fp) # histogram, QQ, CDF and PP plot at once

#6. Fitting uniform distribution 
fu <-fitdist(Consumptiondata,"unif")
plot(fu) # histogram, QQ, CDF and PP plot at once

#7. Fitting logistic distribution
fl <-fitdist(Consumptiondata,"logis")
plot(fl) # histogram, QQ, CDF and PP plot at once

#STEP 6: GRAPHICAL COMPARISON OF THE DIFFERENT FITTED CONTINUOUS DISTRIBUTIONS
par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#A. Comparison of density plots of the different fitted distributions
denscomp(ft, legendtext = "triangle")
denscomp(fn, legendtext = "normal")
denscomp(fg, legendtext = "gamma")
denscomp(fe, legendtext = "exponential") 

denscomp(fp, legendtext = "pareto") 
denscomp(fu, legendtext = "uniform") 
denscomp(fl, legendtext = "logistic") 

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#B. Comparison of CDF plots of the different fitted distributions
cdfcomp(ft, legendtext = "triangle") # Cumulative Distribution plot
cdfcomp(fn, legendtext = "normal") # Cumulative Distribution plot
cdfcomp(fg, legendtext = "gamma") # Cumulative Distribution plot
cdfcomp(fe, legendtext = "exponential") # Cumulative Distribution plot

cdfcomp(fp, legendtext = "pareto") # Cumulative Distribution plot
cdfcomp(fu, legendtext = "uniform") # Cumulative Distribution plot
cdfcomp(fl, legendtext = "logistic") # Cumulative Distribution plot

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#C. Comparison of QQ plots of the different fitted distributions
qqcomp(ft, legendtext = "triangle") #QQplot
qqcomp(fn, legendtext = "normal")
qqcomp(fg, legendtext = "gamma")
qqcomp(fe, legendtext = "exponential") 

qqcomp(fp, legendtext = "pareto") 
qqcomp(fu, legendtext = "uniform") 
qqcomp(fl, legendtext = "logistic") 

par(mfrow=c(2,2)) # make graph 2 by 2 in one page (4 in one page)

#D. Comparison of PP plots of the different fitted distributions
ppcomp(ft, legendtext = "triangle") #PP plot
ppcomp(fn, legendtext = "normal")
ppcomp(fg, legendtext = "gamma")
ppcomp(fe, legendtext = "exponential") 

ppcomp(fp, legendtext = "pareto") 
ppcomp(fu, legendtext = "uniform") 
ppcomp(fl, legendtext = "logistic") 

par(mfrow=c(1,1)) # make graph 1 by 1 in one page (1 in one page)

#Step 7: comparison of the summary statistics of the different fits
# comparison of goodness of fit via log-likelihood, AIC and BIC values
# choose the distribution with low AIC and BIC values (best fit)
summary(ft) #fitted triangle distribution
## Fitting of the distribution ' triang ' by maximum likelihood 
## Parameters : 
##       estimate   Std. Error
## min  -0.002505 6.481336e-04
## max   0.026720 6.581590e-09
## mode  0.001770          NaN
## Loglikelihood:  161.605   AIC:  -317.21   BIC:  -311.9264 
## Correlation matrix:
##                min           max mode
## min   1.000000e+00 -6.515768e-07  NaN
## max  -6.515768e-07  1.000000e+00  NaN
## mode           NaN           NaN    1
summary(fn) #fitted normal distribution
## Fitting of the distribution ' norm ' by maximum likelihood 
## Parameters : 
##         estimate   Std. Error
## mean 0.007734496 0.0009118313
## sd   0.005979278 0.0005653711
## Loglikelihood:  159.1222   AIC:  -314.2445   BIC:  -310.7221 
## Correlation matrix:
##      mean sd
## mean    1  0
## sd      0  1
summary(fg) #fitted gamma distribution
## Fitting of the distribution ' gamma ' by maximum likelihood 
## Parameters : 
##         estimate
## shape   1.634359
## rate  211.307741
## Loglikelihood:  -92341796821   AIC:  184683593646   BIC:  184683593650
summary(fe) #fitted exponential distribution
## Fitting of the distribution ' exp ' by maximum likelihood 
## Parameters : 
##      estimate Std. Error
## rate 129.2909   19.71672
## Loglikelihood:  166.0688   AIC:  -330.1376   BIC:  -328.3764
summary(fp) #fitted pareto distribution
## Fitting of the distribution ' pareto ' by maximum likelihood 
## Parameters : 
##         estimate Std. Error
## shape 3769291.94        NaN
## scale   29150.28   4576.787
## Loglikelihood:  166.0688   AIC:  -328.1376   BIC:  -324.6152 
## Correlation matrix:
##       shape scale
## shape     1   NaN
## scale   NaN     1
summary(fu) #fitted uniform distribution
## Fitting of the distribution ' unif ' by maximum likelihood 
## Parameters : 
##     estimate
## min  0.00000
## max  0.02505
## Loglikelihood:  158.5359   AIC:  -313.0718   BIC:  -309.5494
summary(fl) #fitted logistic distribution
## Fitting of the distribution ' logis ' by maximum likelihood 
## Parameters : 
##             estimate   Std. Error
## location 0.007351559 0.0009194191
## scale    0.003424008 0.0003156140
## Loglikelihood:  158.7641   AIC:  -313.5282   BIC:  -310.0058 
## Correlation matrix:
##            location      scale
## location 1.00000000 0.03737763
## scale    0.03737763 1.00000000
#Step 8: comparison of the fitted continuous distributions using goodness-of-fit statistics
#please remove a fit from the list below if it could not be fitted above
gofstat(list(ft,fn,fg,fe,fp,fu,fl))
## Goodness-of-fit statistics
##                              1-mle-triang 2-mle-norm 3-mle-gamma 4-mle-exp
## Kolmogorov-Smirnov statistic    0.1358209 0.13637510   0.1860465 0.1860465
## Cramer-von Mises statistic      0.1374924 0.09027371   0.1716586 0.3203760
## Anderson-Darling statistic      1.1366162 0.66799939         Inf       Inf
##                              5-mle-pareto 6-mle-unif 7-mle-logis
## Kolmogorov-Smirnov statistic    0.1860465  0.3342153  0.12492624
## Cramer-von Mises statistic      0.3205255  1.9029459  0.08168584
## Anderson-Darling statistic            Inf        Inf  0.62219988
## 
## Goodness-of-fit criteria
##                                1-mle-triang 2-mle-norm  3-mle-gamma 4-mle-exp
## Akaike's Information Criterion    -317.2100  -314.2445 184683593646 -330.1376
## Bayesian Information Criterion    -311.9264  -310.7221 184683593650 -328.3764
##                                5-mle-pareto 6-mle-unif 7-mle-logis
## Akaike's Information Criterion    -328.1376  -313.0718   -313.5282
## Bayesian Information Criterion    -324.6152  -309.5494   -310.0058
#compare GOF and select the best fitted distributions

#based on GOF and the plots, both the logistic and normal distributions fit well over the whole range.
#The logistic distribution was chosen for its lower Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling statistics

#Step 10: based on GOF and the plots, select the best fitted continuous distribution
# Note the estimated parameters with their standard errors from the summary statistics
# standard errors can be used as uncertainties in a 2D Monte Carlo simulation

#Step 11: Uncertainties in the estimated parameters of the selected distribution can also be estimated by bootstrapping


#So the selected distribution for the fries consumption data is the logistic distribution, with parameter uncertainty as follows:
uncertainty <-bootdist(fl,niter = 1001)
summary(uncertainty)
## Parametric bootstrap medians and 95% percentile CI 
##               Median        2.5%       97.5%
## location 0.007338434 0.005696604 0.009093599
## scale    0.003385162 0.002596865 0.004306786
plot(uncertainty)

##################

#Exposure Assessment

###########

library(mc2d)
#one-dimensional Monte Carlo simulation
ndvar(100001)
## [1] 100001
concentration_AA_mcgperkg <-mcstoc(rweibull,shape=1.806815,scale=311.503027)
concentration_AA_mcgperkg[concentration_AA_mcgperkg<0] <-0
consumption_fries_kgperkgbwperweek <-mcstoc(rlogis, location= 0.007351559, scale= 0.003424008)
consumption_fries_kgperkgbwperweek[consumption_fries_kgperkgbwperweek<0] <- 0
#mcg/kgbw/day
exposure <-concentration_AA_mcgperkg*consumption_fries_kgperkgbwperweek/7
exposure[exposure<0] <-0
Exp1 <-mc(concentration_AA_mcgperkg,consumption_fries_kgperkgbwperweek, exposure)
print(Exp1)
##                                 node    mode    nsv nsu nva variate   min
## 1          concentration_AA_mcgperkg numeric 100001   1   1       1 0.542
## 2 consumption_fries_kgperkgbwperweek numeric 100001   1   1       1 0.000
## 3                           exposure numeric 100001   1   1       1 0.000
##       mean   median      max Nas type outm
## 1 2.77e+02 2.53e+02 1.17e+03   0    V each
## 2 7.73e-03 7.34e-03 5.06e-02   0    V each
## 3 3.05e-01 2.24e-01 3.90e+00   0    V each
summary(Exp1)
## concentration_AA_mcgperkg :
##       mean  sd   Min 2.5% 25% 50% 75% 97.5%  Max   nsv Na's
## NoUnc  277 159 0.542 40.6 156 253 373   644 1174 1e+05    0
## 
## consumption_fries_kgperkgbwperweek :
##          mean     sd Min 2.5%    25%     50%    75%  97.5%    Max   nsv Na's
## NoUnc 0.00773 0.0055   0    0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05    0
## 
## exposure :
##        mean    sd Min 2.5%    25%   50%   75% 97.5% Max   nsv Na's
## NoUnc 0.305 0.306   0    0 0.0824 0.224 0.434   1.1 3.9 1e+05    0
hist(Exp1)

plot(Exp1)

summary(concentration_AA_mcgperkg)
## node :
##       mean  sd   Min 2.5% 25% 50% 75% 97.5%  Max   nsv Na's
## NoUnc  277 159 0.542 40.6 156 253 373   644 1174 1e+05    0
summary(consumption_fries_kgperkgbwperweek)
## node :
##          mean     sd Min 2.5%    25%     50%    75%  97.5%    Max   nsv Na's
## NoUnc 0.00773 0.0055   0    0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05    0
# RISK CHARACTERISATION: calculation of the MOE
# For tumours, experts selected a BMDL10 of 0.17 mg/kg bw/day

# For other effects, neurological changes were seen as the
# most relevant, with a BMDL10 of 0.43 mg/kg bw/day.

# here we use the lower BMDL10 of 170 mcg per kg bw per day
percentiles<-quantile(exposure, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
percentiles<-as.data.frame(percentiles)
print(percentiles)
##       exposure.5. exposure.10. exposure.15. exposure.20. exposure.25.
## NoInc           0            0   0.02917884   0.05598821   0.08238223
##       exposure.50. exposure.75. exposure.80. exposure.85. exposure.90.
## NoInc    0.2242505      0.43371    0.4977718    0.5819165    0.6996181
##       exposure.95. exposure.97. exposure.100.
## NoInc    0.9007259     1.052152      3.901302
MOE<-170/percentiles
print(MOE)
##       exposure.5. exposure.10. exposure.15. exposure.20. exposure.25.
## NoInc         Inf          Inf      5826.14     3036.354     2063.552
##       exposure.50. exposure.75. exposure.80. exposure.85. exposure.90.
## NoInc     758.0809      391.967     341.5219     292.1381     242.9897
##       exposure.95. exposure.97. exposure.100.
## NoInc     188.7367     161.5736       43.5752
# For MOE to be above 10000 (low concern), exposure should be below 170/10000 = 0.017
# let's find the percentage of the population with exposure below 0.017 (i.e. MOE above 10,000)
# Percent of the population with MOE above 10,000:

sum(exposure<0.017)/nrow(exposure)*100
## [1] 13.00587
#For MOE to be below 1000, exposure should be above 170/1000 = 0.17
# let's find the percentage of the population with exposure above 0.17 (i.e. MOE below 1,000)
# Percent of the population with MOE below 1,000:

sum(exposure>0.17)/nrow(exposure)*100
## [1] 59.13141
#For MOE to be below 100, exposure should be above 170/100 = 1.7
# let's find the percentage of the population with exposure above 1.7 (i.e. MOE below 100)
# Percent of the population with MOE below 100:

sum(exposure>1.7)/nrow(exposure)*100
## [1] 0.3549965
######
# SCENARIO #1: LET'S ASSUME THAT HEATING BELOW 175 °C REDUCES AA CONCENTRATION BY 10%
#one-dimensional Monte Carlo simulation
concentration_AA_mcgperkg_1 <-concentration_AA_mcgperkg *0.9
concentration_AA_mcgperkg_1[concentration_AA_mcgperkg_1<0] <-0

#mcg/kgbw/day
exposure_1 <-concentration_AA_mcgperkg_1*consumption_fries_kgperkgbwperweek/7
exposure_1[exposure_1<0] <-0
Exp2 <-mc(concentration_AA_mcgperkg_1,consumption_fries_kgperkgbwperweek, exposure_1)
print(Exp2)
##                                 node    mode    nsv nsu nva variate   min
## 1        concentration_AA_mcgperkg_1 numeric 100001   1   1       1 0.488
## 2 consumption_fries_kgperkgbwperweek numeric 100001   1   1       1 0.000
## 3                         exposure_1 numeric 100001   1   1       1 0.000
##       mean   median      max Nas type outm
## 1 2.49e+02 2.28e+02 1.06e+03   0    V each
## 2 7.73e-03 7.34e-03 5.06e-02   0    V each
## 3 2.75e-01 2.02e-01 3.51e+00   0    V each
summary(Exp2)
## concentration_AA_mcgperkg_1 :
##       mean  sd   Min 2.5% 25% 50% 75% 97.5%  Max   nsv Na's
## NoUnc  249 143 0.488 36.5 140 228 335   579 1056 1e+05    0
## 
## consumption_fries_kgperkgbwperweek :
##          mean     sd Min 2.5%    25%     50%    75%  97.5%    Max   nsv Na's
## NoUnc 0.00773 0.0055   0    0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05    0
## 
## exposure_1 :
##        mean    sd Min 2.5%    25%   50%  75% 97.5%  Max   nsv Na's
## NoUnc 0.275 0.275   0    0 0.0741 0.202 0.39 0.993 3.51 1e+05    0
hist(Exp2)

plot(Exp2)

summary(concentration_AA_mcgperkg_1)
## node :
##       mean  sd   Min 2.5% 25% 50% 75% 97.5%  Max   nsv Na's
## NoUnc  249 143 0.488 36.5 140 228 335   579 1056 1e+05    0
# RISK CHARACTERISATION: calculation of the MOE
# For tumours, experts selected a BMDL10 of 0.17 mg/kg bw/day

# For other effects, neurological changes were seen as the
# most relevant, with a BMDL10 of 0.43 mg/kg bw/day.

# here we use the lower BMDL10 of 170 mcg per kg bw per day
percentiles_1<-quantile(exposure_1, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
percentiles_1<-as.data.frame(percentiles_1)
print(percentiles_1)
##       exposure_1.5. exposure_1.10. exposure_1.15. exposure_1.20. exposure_1.25.
## NoInc             0              0     0.02626096     0.05038939     0.07414401
##       exposure_1.50. exposure_1.75. exposure_1.80. exposure_1.85.
## NoInc      0.2018254       0.390339      0.4479947      0.5237248
##       exposure_1.90. exposure_1.95. exposure_1.97. exposure_1.100.
## NoInc      0.6296563      0.8106533       0.946937        3.511172
MOE_1<-170/percentiles_1
print(MOE_1)
##       exposure_1.5. exposure_1.10. exposure_1.15. exposure_1.20. exposure_1.25.
## NoInc           Inf            Inf       6473.489       3373.726       2292.835
##       exposure_1.50. exposure_1.75. exposure_1.80. exposure_1.85.
## NoInc       842.3121       435.5188       379.4688       324.5979
##       exposure_1.90. exposure_1.95. exposure_1.97. exposure_1.100.
## NoInc       269.9886       209.7074       179.5262        48.41688
# For MOE to be above 10000 (low concern), exposure should be below 170/10000 = 0.017
# let's find the percentage of the population with exposure below 0.017 (i.e. MOE above 10,000)
# Percent of the population with MOE above 10,000:

sum(exposure_1<0.017)/nrow(exposure_1)*100
## [1] 13.28587
#For MOE to be below 1000, exposure should be above 170/1000 = 0.17
# let's find the percentage of the population with exposure above 0.17 (i.e. MOE below 1,000)
# Percent of the population with MOE below 1,000:

sum(exposure_1>0.17)/nrow(exposure_1)*100
## [1] 55.83244
#For MOE to be below 100, exposure should be above 170/100 = 1.7
# let's find the percentage of the population with exposure above 1.7 (i.e. MOE below 100)
# Percent of the population with MOE below 100:

sum(exposure_1>1.7)/nrow(exposure_1)*100
## [1] 0.1899981
######
# SCENARIO #2: LET'S ASSUME THAT HEATING BELOW 175 °C IN THE PRESENCE OF CITRIC ACID REDUCES AA CONCENTRATION BY 90%
#one-dimensional Monte Carlo simulation
concentration_AA_mcgperkg_2 <-concentration_AA_mcgperkg *0.1
concentration_AA_mcgperkg_2[concentration_AA_mcgperkg_2<0] <-0

#mcg/kgbw/day
exposure_2 <-concentration_AA_mcgperkg_2*consumption_fries_kgperkgbwperweek/7
exposure_2[exposure_2<0] <-0
Exp3 <-mc(concentration_AA_mcgperkg_2,consumption_fries_kgperkgbwperweek, exposure_2)
print(Exp3)
##                                 node    mode    nsv nsu nva variate    min
## 1        concentration_AA_mcgperkg_2 numeric 100001   1   1       1 0.0542
## 2 consumption_fries_kgperkgbwperweek numeric 100001   1   1       1 0.0000
## 3                         exposure_2 numeric 100001   1   1       1 0.0000
##       mean   median      max Nas type outm
## 1 27.66696 25.31538 117.3652   0    V each
## 2  0.00773  0.00734   0.0506   0    V each
## 3  0.03052  0.02243   0.3901   0    V each
summary(Exp3)
## concentration_AA_mcgperkg_2 :
##       mean   sd    Min 2.5%  25%  50%  75% 97.5% Max   nsv Na's
## NoUnc 27.7 15.9 0.0542 4.06 15.6 25.3 37.3  64.4 117 1e+05    0
## 
## consumption_fries_kgperkgbwperweek :
##          mean     sd Min 2.5%    25%     50%    75%  97.5%    Max   nsv Na's
## NoUnc 0.00773 0.0055   0    0 0.0036 0.00734 0.0111 0.0199 0.0506 1e+05    0
## 
## exposure_2 :
##         mean     sd Min 2.5%     25%    50%    75% 97.5%  Max   nsv Na's
## NoUnc 0.0305 0.0306   0    0 0.00824 0.0224 0.0434  0.11 0.39 1e+05    0
hist(Exp3)

plot(Exp3)

summary(concentration_AA_mcgperkg_2)
## node :
##       mean   sd    Min 2.5%  25%  50%  75% 97.5% Max   nsv Na's
## NoUnc 27.7 15.9 0.0542 4.06 15.6 25.3 37.3  64.4 117 1e+05    0
# RISK CHARACTERISATION: calculation of the MOE
# For tumours, experts selected a BMDL10 of 0.17 mg/kg bw/day

# For other effects, neurological changes were seen as the
# most relevant, with a BMDL10 of 0.43 mg/kg bw/day.

# here we use the lower BMDL10 of 170 mcg per kg bw per day

percentiles_2<-quantile(exposure_2, probs = c(0.05,0.1,0.15,0.2,0.25, 0.5, 0.75,0.8,0.85,0.9,0.95,0.97,1.0))
percentiles_2<-as.data.frame(percentiles_2)
print(percentiles_2)
##       exposure_2.5. exposure_2.10. exposure_2.15. exposure_2.20. exposure_2.25.
## NoInc             0              0    0.002917884    0.005598821    0.008238223
##       exposure_2.50. exposure_2.75. exposure_2.80. exposure_2.85.
## NoInc     0.02242505       0.043371     0.04977718     0.05819165
##       exposure_2.90. exposure_2.95. exposure_2.97. exposure_2.100.
## NoInc     0.06996181     0.09007259      0.1052152       0.3901302
MOE_2<-170/percentiles_2
print(MOE_2)
##       exposure_2.5. exposure_2.10. exposure_2.15. exposure_2.20. exposure_2.25.
## NoInc           Inf            Inf        58261.4       30363.54       20635.52
##       exposure_2.50. exposure_2.75. exposure_2.80. exposure_2.85.
## NoInc       7580.809        3919.67       3415.219       2921.381
##       exposure_2.90. exposure_2.95. exposure_2.97. exposure_2.100.
## NoInc       2429.897       1887.367       1615.736         435.752
# For MOE to be above 10000 (low concern), exposure should be below 170/10000 = 0.017
# let's find the percentage of the population with exposure below 0.017 (i.e. MOE above 10,000)
# Percent of the population with MOE above 10,000:

sum(exposure_2<0.017)/nrow(exposure_2)*100
## [1] 40.86859
#For MOE to be below 1000, exposure should be above 170/1000 = 0.17
# let's find the percentage of the population with exposure above 0.17 (i.e. MOE below 1,000)
# Percent of the population with MOE below 1,000:

sum(exposure_2>0.17)/nrow(exposure_2)*100
## [1] 0.3549965
#For MOE to be below 100, exposure should be above 170/100 = 1.7
# let's find the percentage of the population with exposure above 1.7 (i.e. MOE below 100)
# Percent of the population with MOE below 100:

sum(exposure_2>1.7)/nrow(exposure_2)*100
## [1] 0
# summary
# Create a comprehensive comparison table
comparison_table <- data.frame(
  Scenario = c("Baseline", "Scenario 1 (10% reduction)", "Scenario 2 (90% reduction)"),
  Median_Exposure = c(
    median(exposure),
    median(exposure_1), 
    median(exposure_2)
  ),
  P95_Exposure = c(
    as.numeric(quantile(exposure, 0.95)),     # as.numeric() drops the mcnode matrix
    as.numeric(quantile(exposure_1, 0.95)),   # structure so the three values form
    as.numeric(quantile(exposure_2, 0.95))    # a single clean column
  ),
  Percent_Safe = c(
    sum(exposure < 0.017) / nrow(exposure) * 100,
    sum(exposure_1 < 0.017) / nrow(exposure_1) * 100,
    sum(exposure_2 < 0.017) / nrow(exposure_2) * 100
  ),
  Percent_Potential_Concern = c(
    sum(exposure > 0.17) / nrow(exposure) * 100,
    sum(exposure_1 > 0.17) / nrow(exposure_1) * 100,
    sum(exposure_2 > 0.17) / nrow(exposure_2) * 100
  ),
  Percent_High_Concern = c(
    sum(exposure > 1.7) / nrow(exposure) * 100,
    sum(exposure_1 > 1.7) / nrow(exposure_1) * 100,
    sum(exposure_2 > 1.7) / nrow(exposure_2) * 100
  )
)
print("Comprehensive Scenario Comparison:")
## [1] "Comprehensive Scenario Comparison:"
print(comparison_table)
##                     Scenario Median_Exposure P95_Exposure Percent_Safe
## 1                   Baseline      0.22425048   0.90072590     13.00587
## 2 Scenario 1 (10% reduction)      0.20182543   0.81065330     13.28587
## 3 Scenario 2 (90% reduction)      0.02242505   0.09007259     40.86859
##   Percent_Potential_Concern Percent_High_Concern
## 1                59.1314087            0.3549965
## 2                55.8324417            0.1899981
## 3                 0.3549965            0.0000000
library(ggplot2)
library(gridExtra)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:gridExtra':
## 
##     combine
## The following object is masked from 'package:MASS':
## 
##     select
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
# Prepare data for plotting
exposure_data <- data.frame(
  Exposure = c(exposure, exposure_1, exposure_2),
  Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"), 
                 each = length(exposure))
)

# Create density plot
p1 <- ggplot(exposure_data, aes(x = Exposure, fill = Scenario)) +
  geom_density(alpha = 0.6) +
  scale_x_continuous(trans = 'log10', limits = c(1e-4, 10)) +
  geom_vline(xintercept = 0.017, linetype = "dashed", color = "green", linewidth = 1) +
  geom_vline(xintercept = 0.17, linetype = "dashed", color = "orange", linewidth = 1) +
  geom_vline(xintercept = 1.7, linetype = "dashed", color = "red", linewidth = 1) +
  labs(title = "Exposure Distribution Across Scenarios",
       x = "Daily Exposure (μg/kg bw/day, log scale)",
       y = "Density") +
  theme_minimal() +
  annotate("text", x = 0.01, y = 1, label = "MOE > 10,000", color = "green", size = 3) +
  annotate("text", x = 0.1, y = 1, label = "MOE < 1,000", color = "orange", size = 3) +
  annotate("text", x = 2, y = 1, label = "MOE < 100", color = "red", size = 3)
print(p1)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_density()`).

# Calculate MOE for all scenarios
MOE_baseline <- 170 / exposure
MOE_scenario1 <- 170 / exposure_1
MOE_scenario2 <- 170 / exposure_2

moe_data <- data.frame(
  MOE = c(MOE_baseline, MOE_scenario1, MOE_scenario2),
  Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"), 
                 each = length(MOE_baseline))
)

# Remove infinite values for plotting
moe_data <- moe_data[is.finite(moe_data$MOE), ]

p2 <- ggplot(moe_data, aes(x = Scenario, y = MOE, fill = Scenario)) +
  geom_boxplot(alpha = 0.7) +
  scale_y_continuous(trans = 'log10', limits = c(10, 100000)) +
  geom_hline(yintercept = 100, linetype = "dashed", color = "red", linewidth = 1) +
  geom_hline(yintercept = 1000, linetype = "dashed", color = "orange", linewidth = 1) +
  geom_hline(yintercept = 10000, linetype = "dashed", color = "green", linewidth = 1) +
  labs(title = "Margin of Exposure (MOE) Comparison",
       y = "MOE (log scale)") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  annotate("text", x = 1.5, y = 50, label = "MOE < 100", color = "red", size = 3) +
  annotate("text", x = 1.5, y = 500, label = "MOE < 1,000", color = "orange", size = 3) +
  annotate("text", x = 1.5, y = 5000, label = "MOE > 10,000", color = "green", size = 3)

print(p2)
## Warning: Removed 2906 rows containing non-finite outside the scale range
## (`stat_boxplot()`).

# Calculate risk categories
risk_categories <- data.frame(
  Scenario = rep(c("Baseline", "Scenario 1", "Scenario 2"), each = 3),
  Risk_Level = rep(c("Safe (MOE > 10,000)", "Potential Concern (MOE < 1,000)", "High Concern (MOE < 100)"), 3),
  Percentage = c(
    sum(exposure < 0.017) / nrow(exposure) * 100,
    sum(exposure > 0.17) / nrow(exposure) * 100,
    sum(exposure > 1.7) / nrow(exposure) * 100,
    sum(exposure_1 < 0.017) / nrow(exposure_1) * 100,
    sum(exposure_1 > 0.17) / nrow(exposure_1) * 100,
    sum(exposure_1 > 1.7) / nrow(exposure_1) * 100,
    sum(exposure_2 < 0.017) / nrow(exposure_2) * 100,
    sum(exposure_2 > 0.17) / nrow(exposure_2) * 100,
    sum(exposure_2 > 1.7) / nrow(exposure_2) * 100
  )
)

p3 <- ggplot(risk_categories, aes(x = Scenario, y = Percentage, fill = Risk_Level)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(title = "Population Distribution Across Risk Categories",
       y = "Percentage of Population (%)",
       fill = "Risk Category") +
  theme_minimal() +
  scale_fill_manual(values = c("Safe (MOE > 10,000)" = "green", 
                              "Potential Concern (MOE < 1,000)" = "orange", 
                              "High Concern (MOE < 100)" = "red")) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p3)

# Create cumulative distribution plot
p4 <- ggplot(exposure_data, aes(x = Exposure, color = Scenario)) +
  stat_ecdf(linewidth = 1) +
  scale_x_continuous(trans = 'log10', limits = c(1e-4, 10)) +
  geom_vline(xintercept = 0.017, linetype = "dashed", color = "green", linewidth = 0.5) +
  geom_vline(xintercept = 0.17, linetype = "dashed", color = "orange", linewidth = 0.5) +
  geom_vline(xintercept = 1.7, linetype = "dashed", color = "red", linewidth = 0.5) +
  labs(title = "Cumulative Distribution of Exposure",
       x = "Daily Exposure (μg/kg bw/day, log scale)",
       y = "Cumulative Probability") +
  theme_minimal()

print(p4)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_ecdf()`).

# Create a comprehensive dashboard
library(patchwork)
## 
## Attaching package: 'patchwork'
## The following object is masked from 'package:MASS':
## 
##     area
# Arrange all plots
combined_plot <- (p1 | p2) / (p3 | p4) + 
  plot_annotation(title = "Acrylamide Risk Assessment: Scenario Comparison Dashboard",
                  theme = theme(plot.title = element_text(hjust = 0.5, size = 16)))

print(combined_plot)
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_density()`).
## Warning: Removed 2906 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
## Warning in scale_x_continuous(trans = "log10", limits = c(1e-04, 10)): log-10
## transformation introduced infinite values.
## Warning: Removed 31803 rows containing non-finite outside the scale range
## (`stat_ecdf()`).

# Print summary table
print("Summary Comparison Table:")
## [1] "Summary Comparison Table:"
print(comparison_table)
##                     Scenario Median_Exposure P95_Exposure Percent_Safe
## 1                   Baseline      0.22425048   0.90072590     13.00587
## 2 Scenario 1 (10% reduction)      0.20182543   0.81065330     13.28587
## 3 Scenario 2 (90% reduction)      0.02242505   0.09007259     40.86859
##   Percent_Potential_Concern Percent_High_Concern
## 1                59.1314087            0.3549965
## 2                55.8324417            0.1899981
## 3                 0.3549965            0.0000000
