Here is an extract of the introduction to a (draft) paper making some of the points in more detail:
The subject of Topological Data Analysis (TDA) has emerged recently as that part of Computational Topology concerned with applying the methods of that subject to the analysis of data sets, which are often very large. The methods used are adapted from algebraic and differential topology and are closely related to those used for spatial reconstruction from scanned data in Visualisation, but the context is, theoretically, limited neither to low dimensions, nor to data of spatial origin, nor, initially, to the visualisation of the data. Its aim, rather, is to give qualitative information on the data, allowing for statistical variation, noise, etc. This, if that view is correct, puts it in a strange position. The invariants it puts forward will, if they are to be successful, have to give useful information about the data. The theory does require some hard work by the user to interpret the information it gives in a usable way, but until it is interpreted, how can it be useful? It is probable that one can extend the methods beyond those currently being used, but there should also be an examination of what new information can be obtained by using those same methods in new ways.
The applications considered so far have looked, for instance, for qualitative structure in the clusters obtained by some classifier. These methods usually assume that the data come from sampling a manifold or simplicial complex. To these we would suggest the addition of a new type of analysis, related to the verification of a mathematical model for what theoretically might be a non-linear situation involving feedback and perhaps even some chaotic aspects. The idealised mathematical model would then be likely to predict that the experimental data behave as if sampled from a (possibly high dimensional) space that is embedded in some (even higher dimensional) Euclidean space, but this idealised model space need not form a manifold, and might even be fractal in its nature.
It has been argued that there are two related views of Spatial Representation. The first is representation of spaces, but the second, and for the purposes here the more basic one, is representation by spaces. In both cases, the space is an idealised object obtained as the limiting case of indefinitely refined observations of the context object or data. A mathematical model, if one is possible, will give a second idealised object, another `space'. As this second idealised space is typically observable only through finite approximations, it also is obtained as a limit. One eventual aim of TDA in this situation could be to provide a comparison between these two `spaces'. In other words, in this analysis, any space gives an idealisation of a context; whether or not that is a `spatial context', or even what `spatial context' means, might be the subject of a lot of philosophical debate, and we will not explore it further here.
The classical methods of algebraic topology had quite a lot to say about the `invariants' of such limiting spaces, and we will look briefly at some of these methods later. By themselves they are not algorithmically efficient, and sometimes not even feasible, but they can be adapted to give much more computationally `friendly' tools. We will briefly review the history of these tools to show how, on a limiting space, the information on the (finite) approximations relates to that on the limit. Here some examples show behaviour relevant to our `non-linear model' thought experiment. The critical phenomena in such examples relate closely to the analysis not so much of the approximations to the limit as of the refinement or comparison maps between them. This means that we must examine whether the available tools of the present form of TDA (in particular the various forms of homology) can be pushed beyond their current analysis of objects to help in the analysis of maps and, in the limit, to give qualitative information on the idealised limiting space.
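As a hedged illustration of why the refinement maps matter more than the individual approximations, consider the dyadic solenoid. This is a standard textbook example and not necessarily one of the examples the paper has in mind; the symbols $X_n$, $p_n$ and $\Sigma$ are introduced here purely for illustration. Every finite approximation is a circle, so its first homology detects a single loop, yet each refinement map wraps the finer circle twice around the coarser one, and the first Čech homology of the limit vanishes:

```latex
% Requires amsmath (for \varprojlim and \xleftarrow) and amssymb (for \mathbb).
% Tower of approximations: each X_n is a circle, each refinement map has degree 2.
\[
  \Sigma \;=\; \varprojlim \bigl( X_0 \xleftarrow{\;p_0\;} X_1 \xleftarrow{\;p_1\;} X_2 \xleftarrow{\;p_2\;} \cdots \bigr),
  \qquad X_n = S^1, \quad p_n(z) = z^2 .
\]
% Each approximation on its own has H_1 = Z; the induced maps are multiplication by 2.
\[
  H_1(X_n) \cong \mathbb{Z}, \qquad (p_n)_* = \times 2 \colon H_1(X_{n+1}) \to H_1(X_n).
\]
% An element of the inverse limit is a sequence (a_n) with a_n = 2 a_{n+1} for all n,
% so every a_n is divisible by every power of 2, which forces a_n = 0.
\[
  \check{H}_1(\Sigma) \;\cong\; \varprojlim \bigl( \mathbb{Z} \xleftarrow{\times 2} \mathbb{Z} \xleftarrow{\times 2} \cdots \bigr) \;=\; 0 .
\]
```

The homology of each approximation is the same at every stage, so the qualitative difference between a circle and the solenoid is carried entirely by the induced maps $(p_n)_*$.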