diff --git a/doc/src/parts/faq.rst b/doc/src/parts/faq.rst
new file mode 100644
index 00000000..8aa0d76e
--- /dev/null
+++ b/doc/src/parts/faq.rst
@@ -0,0 +1,198 @@
+.. meta::
+    :description lang=en: kafe2 - a Python package for fitting parametric
+        models to several types of data with uncertainties
+    :robots: index, follow
+
+.. _faq:
+
+**********
+FAQ
+**********
+
+
+General fitting
+===============
+
+**Why do I have to specify uncertainties for my data points in a kafe2 fit?**
+
+**What happens when I don't specify errors in kafe2?**
+
+**Why does kafe2 give warnings about missing uncertainties?**
+
+In general, the goal of a fitting algorithm is to minimize the so-called cost function.
+This can for example be a :math:`\chi^2` function or a negative log-likelihood function.
+All of these cost functions include the uncertainties of the data points.
+A more detailed explanation of the fitting procedure and cost functions
+can be found in the :ref:`Mathematical Foundations ` section.
+If no uncertainties are specified, the relative weights of the data points are undefined
+and a meaningful minimum of the cost function cannot be determined.
+
+
+**Why can I not just set the uncertainty to zero?**
+
+**Why does kafe2 give an infinite cost function warning?**
+
+If one or more data points have an uncertainty of zero, the fit will not work and kafe2 will warn you.
+For example, in the :math:`\chi^2` function in the :ref:`Mathematical Foundations ` section the
+uncertainties appear in the denominator, so setting any uncertainty to zero results in a division by zero.
+This makes the cost function infinite for every combination of parameter values;
+finding a minimum is impossible and the fit cannot converge.
+
+
+**Why does kafe2 require measurement errors when SciPy doesn't?**
+
+**What can I do if I don't have any errors from my measurement?**
+
+Some fitting algorithms, like those in SciPy, allow users to leave uncertainties unspecified.
+In this case SciPy assumes all errors to be equal to one and afterwards rescales them such that
+the :math:`\chi^2/\mathrm{ndf}` value is close to one.
+This destroys your measure for comparing different models of an experiment (generally the goal of a fit in physics),
+because all models will yield the same goodness-of-fit value.
+In principle you could do the same manually using kafe2, but you are encouraged not to:
+every physical measurement has uncertainties, both systematic and statistical ones,
+if only because measuring devices are not perfect.
+
+
+My fit does not converge
+========================
+
+**I use a correct model, but the best fit values are always off from the values I would expect. What could cause this?**
+
+**How do starting values influence kafe2 fits?**
+
+**How can I improve convergence if the fit result seems wrong?**
+
+One reason for this could be unspecified or poorly chosen starting values.
+The fitting algorithm minimizes the cost function numerically, which requires a starting value for each parameter.
+Ideally this starting value is already close to the true value of the parameter.
+If the starting value is too far off, the minimizer searches in the wrong region and may only find a local minimum,
+while we of course search for the global one.
+Starting values can be defined by specifying default values in the model function, as shown in the sketch below.
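+
+A minimal sketch of this, assuming a simple exponential model; the model function
+and the numbers are invented for illustration:
+
+.. code-block:: python
+
+    import numpy as np
+    from kafe2 import XYFit
+
+    # The default values of the parameters serve as starting values,
+    # i.e. the minimizer begins its search at A=1.0 and k=1.0.
+    # Ideally these sit close to the values you expect from the fit.
+    def exponential_model(x, A=1.0, k=1.0):
+        return A * np.exp(k * x)
+
+    x_data = [1.0, 2.0, 3.0, 4.0]
+    y_data = [2.7, 7.4, 20.1, 54.6]
+
+    fit = XYFit(xy_data=[x_data, y_data], model_function=exponential_model)
+    fit.add_error(axis='y', err_val=0.5)  # uncertainties are required, see above
+    fit.do_fit()
+    fit.report()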
+
+
+**The parameter I want to estimate from my fit is small and the fit doesn't find it. How can I fix this?**
+
+**How can I make the minimizer detect very small parameters?**
+
+**How do I adjust the parameter range in a kafe2 fit?**
+
+A common problem is the step size of the minimizer.
+Consider a parameter that has a true value of 0.5 while the minimizer only looks at integers:
+it will miss the global minimum and may find some other local minimum or not converge at all.
+Of course the minimizer used in kafe2 is more sophisticated and does not only look at integers,
+but if the step size of the minimizer is too large relative to the scale of a parameter, similar problems can occur.
+To control the initial step size, it is possible to limit a parameter to some expected range using the
+:py:meth:`~.Fit.limit_parameter` method.
+For instance, if you expect a parameter to have a value around :math:`5 \cdot 10^{-10}`, it does not make sense
+for the minimizer to search between 0 and 1; searching between :math:`1 \cdot 10^{-10}` and :math:`1 \cdot 10^{-9}`
+will improve the chance of finding the correct minimum.
+
+
+**The numerical values of my x and y measurements are separated by many orders of magnitude. Can this influence my fit result?**
+
+**How can I handle parameters that differ by many orders of magnitude?**
+
+Yes, large differences in the scale of the data can cause problems.
+This can for example occur when fitting Planck's constant :math:`h`,
+where the frequency is measured in hertz (order of :math:`10^{15}`) and the electron energy is measured
+in joule (order of :math:`10^{-18}`).
+A step on the frequency axis requires very large changes to affect the output significantly, while tiny steps on the
+energy axis can already lead to large changes of the output.
+This imbalance makes it hard for the minimizer to navigate the parameter space due to numerical instability.
+An easy solution is to rescale the data such that the x and y values are on more similar scales.
+In the case of Planck's constant it can already be enough to specify the energy in eV,
+reducing the difference in orders of magnitude by 19.
+
+
+Histogram Fits
+==============
+
+**When should I use a histogram fit?**
+
+**Why does my fit take so long?**
+
+**Why is an XYFit slow for large datasets?**
+
+When a large number of data points is given, the usual :py:class:`~.XYFit` becomes computationally very
+expensive. A practical solution is to fill the data into a smaller number of bins.
+This reduces the number of data points significantly and thus the amount of computing power necessary.
+The histogrammed data can then be fitted with a histogram fit.
+
+
+**What is the difference between a HistFit and an XYFit?**
+
+**What is special about the cost function in a histogram fit?**
+
+A :py:class:`~.HistFit` and an :py:class:`~.XYFit` mainly differ in how they handle the data and statistical uncertainties.
+The data has to be passed to the :py:class:`~.HistFit` in a :py:class:`~.HistContainer`;
+this container type can also directly histogram your raw data.
+The default cost function of a histogram fit is a Poisson likelihood, compared to a :math:`\chi^2` function
+in the case of an :py:class:`~.XYFit`.
+This is important for the correct handling of statistical uncertainties, especially when dealing with empty bins.
+In histogram fits the statistical uncertainty is by default calculated directly from the model function, which reduces biases.
+In contrast, an :py:class:`~.XYFit` assumes Gaussian uncertainties by default and handles uncertainties point by point.
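+
+A minimal sketch of a histogram fit; the Gaussian toy data are invented for
+illustration, and the fit is assumed to use kafe2's default model (a normal
+distribution):
+
+.. code-block:: python
+
+    import numpy as np
+    from kafe2 import HistContainer, HistFit
+
+    # 10000 raw measurements; an XYFit on these points would be very expensive.
+    raw_data = np.random.normal(loc=0.0, scale=1.0, size=10000)
+
+    # The container histograms the raw data into 30 bins over the given range.
+    hist_data = HistContainer(n_bins=30, bin_range=(-4.0, 4.0), fill_data=raw_data)
+
+    # By default the fit uses a Poisson likelihood as its cost function and
+    # derives the statistical bin uncertainties from the model itself.
+    fit = HistFit(data=hist_data)
+    fit.do_fit()
+    fit.report()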
+
+
+**Do I have to specify errors for my histogram fit?**
+
+**Is it possible to combine Poisson errors and Gaussian errors?**
+
+**How does kafe2 handle uncertainties in a histogram fit?**
+
+The statistical uncertainties in a histogram fit are inferred from the model function, so statistical errors
+don't have to be specified in the container.
+Systematic uncertainties can be added via the usual :py:meth:`add_error` method.
+Since these uncertainties are assumed to be Gaussian, the cost function of the fit then has to use the
+Gaussian approximation, which makes it possible to (approximately) combine Poisson and Gaussian uncertainties.
+
+
+**What do I have to watch out for when I use a large number of data points?**
+
+**Why do systematics become important with large datasets?**
+
+**How does the dataset size influence statistical and systematic uncertainties?**
+
+To reduce computing time it can be useful to histogram the data and then use a histogram fit;
+this reduces the number of effective data points in your fit.
+Furthermore, the larger the amount of data, the smaller the relative statistical uncertainties become.
+When handling large amounts of data you may therefore reach a regime where systematic errors,
+even though they are small, play a significant role and influence the outcome of your fit.
+It can be useful to estimate the statistical uncertainties for a few bins and compare them to possible systematics.
+
+
+Plotting
+========
+
+**How can I customize the colors of my plot?**
+
+**Can I change the color of my fit?**
+
+**Is it possible to change the color or shape of my data points in a plot?**
+
+You can fully customize the appearance of your plots.
+This is described in detail in the plotting section of the :ref:`User Guide <user_guide>`.
+In general the :py:meth:`~.Plot.customize` method can be used to customize the plot style of the data points,
+the data error bars, the model line and the model error band.
+Besides the color, the shape of the markers or the label of an object can also be changed.
+
+
+**Can I customize the axes of my plot?**
+
+**Can I change an axis label of my plot?**
+
+**How do I use a logarithmic scale in kafe2?**
+
+The axis labels of a plot can be set manually using the :py:meth:`~.Plot.x_label` and :py:meth:`~.Plot.y_label` methods.
+Furthermore, it is possible to rescale an axis (e.g. to a logarithmic scale) and to change the plot range
+using the :py:meth:`~.Plot.x_scale` and :py:meth:`~.Plot.x_range` methods.
+Examples can be found in the plotting section of the :ref:`User Guide <user_guide>`.
+
+
+Interpretation
+==============
+
+**What exactly does the chi-square probability in the fit result tell me?**
+
+**What is a p-value?**
+
+The :math:`\chi^2` probability is essentially a p-value.
+Assuming that the model you used for the fit is true, it tells you how likely it is to obtain
+this :math:`\chi^2` value or a larger one.
+In other words: if my model is correct, how likely is it to observe data that is at least this incompatible with the model?
+If the value is very close to 1 you might have overestimated your uncertainties;
+if it is close to 0 your model could be wrong.
+This metric can be used to compare how well different models describe the observed data.
+For more information see the hypothesis testing section of the :ref:`Mathematical Foundations `.
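+
+As a small illustration, the p-value can be computed from the :math:`\chi^2` value
+and the number of degrees of freedom with SciPy; the numbers are made up:
+
+.. code-block:: python
+
+    from scipy.stats import chi2
+
+    chi2_value = 12.3  # hypothetical fit outcome
+    ndf = 10           # hypothetical number of degrees of freedom
+
+    # Survival function: probability of a chi-square value at least this
+    # large, assuming the model is correct.
+    p_value = chi2.sf(chi2_value, ndf)
+    print(f"chi2 probability (p-value): {p_value:.3f}")  # about 0.27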