

Why use fastmap?

As a data scientist: Most of your time is spent pre-processing data and managing servers. This wasn't what you were hired for. You were hired to analyze data and generate insights. Use fastmap to spend less time fighting infrastructure and more time doing things that matter.

As a backend engineer: Parallelizing code on the server is an essential yet non-trivial task. Instead of provisioning microservices, auditing security, and building release pipelines, why not use something that requires little-to-no provisioning, works the same locally as it does in the cloud, and is simple enough for a summer intern to set up?

As an academic: You spend too much of your time reading OpenMP documentation and not enough time developing and simulating models. What matters is getting your research done - not configuring technology.

When is fastmap appropriate to use?

Fastmap is best when map is too slow but setting up a Spark cluster or deploying Lambdas would be overkill.

As a rule of thumb, fastmap will speed up any call to map that would otherwise have taken more than one second. This is possible because, by default, fastmap algorithmically distributes work between local and cloud resources.

If in doubt, try running fastmap with a small test dataset. Fastmap is transparent and will inform you when using it has made your code slower.
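One rough way to apply the one-second rule of thumb before changing any code is to time the function on a small sample and extrapolate. The sketch below uses only the standard library and does not call fastmap. The helper estimate_map_seconds and the stand-in function slow_square are hypothetical names chosen for illustration, and linear extrapolation is an assumption that only holds when per-item cost is roughly constant.

```python
import time


def estimate_map_seconds(func, sample, total_items):
    """Time func over a small sample and extrapolate linearly to total_items."""
    sample = list(sample)
    if not sample:
        raise ValueError("sample must contain at least one item")
    start = time.perf_counter()
    for item in sample:
        func(item)
    elapsed = time.perf_counter() - start
    return elapsed / len(sample) * total_items


def slow_square(x):
    # Hypothetical stand-in for real per-item work.
    time.sleep(0.001)
    return x * x


# Time 20 items, then estimate the cost of mapping over 5000 items.
estimate = estimate_map_seconds(slow_square, range(20), total_items=5000)
worth_trying = estimate > 1.0  # the one-second rule of thumb
print(f"estimated {estimate:.2f}s for the full run; worth trying: {worth_trying}")
```

If the estimate is well under one second, plain map is probably fine as it is.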

What versions of Python are available?

Fastmap works with Python 3.6+. Python 2 is not supported.

What sort of functions can fastmap process?

Fastmap can handle almost any Python function with any number of dependencies. There are three important restrictions, though:

  • Stateful functions: The function should not use or alter mutable global state. In other words, it should be a pure function.
  • Multiprocessing: Fastmap functions cannot use the multiprocessing module internally.
  • Compiled Python extensions: Fastmap can run most modules written in pure Python as well as most third-party modules regardless of language. However, local non-Python modules (e.g. compiled C, Fortran, Rust, etc.) are not supported.
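The first restriction can be illustrated without fastmap at all. In the sketch below (all names are hypothetical), stateful_label reads and mutates a module-level counter, so its output depends on how many times it has already been called and in which process. pure_label receives everything it needs through its argument, so each item can be processed anywhere, in any order. The built-in map is used as a stand-in.

```python
# Stateful: reads and mutates a module-level counter, so the result for a
# given item depends on call history.
counter = 0


def stateful_label(x):
    global counter
    counter += 1
    return f"{counter}:{x}"


# Pure: the index travels with the item instead of living in global state.
def pure_label(pair):
    index, x = pair
    return f"{index}:{x}"


items = ["a", "b", "c"]

# Two identical passes over the same input give different results.
first_pass = list(map(stateful_label, items))
second_pass = list(map(stateful_label, items))

# The pure version gives the same result for the same input every time.
pure_results = list(map(pure_label, enumerate(items, start=1)))
print(first_pass, second_pass, pure_results)
```

Passing the index alongside each item is one common way to turn a counter-based function into a pure one.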

What sort of iterables can fastmap process?

  • Python sequences (lists, tuples, etc.)
  • Python generators

Fastmap cannot currently process NumPy ndarrays or pandas DataFrames, but support for both is on the roadmap.
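For reference, the two supported input kinds look like this in plain Python. The built-in map stands in for fastmap here, and the names double and readings are hypothetical. A sequence holds all of its items in memory up front, while a generator produces them one at a time.

```python
def double(x):
    return 2 * x


def readings(n):
    """Generator: yields items one at a time instead of building a list."""
    for i in range(n):
        yield i * 0.5


from_list = list(map(double, [1, 2, 3]))         # sequence: list
from_tuple = list(map(double, (1, 2, 3)))        # sequence: tuple
from_generator = list(map(double, readings(3)))  # generator
print(from_list, from_tuple, from_generator)
```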

How does fastmap compare to Spark / Hadoop?

Spark and Hadoop have dozens of features that you may or may not need. Setting up these frameworks as clusters can be time-consuming and expensive. By contrast, fastmap does exactly one thing: it makes it easy to parallelize arbitrary code in the cloud.

How does fastmap compare to Lambda / Cloud functions?

For many common applications, fastmap can be a substitute for AWS Lambda and similar technologies. Fastmap's key advantage is that code is deployed at runtime. This means no need to manage separate repos, no infrastructure configuration, and no coordinated deploys. Also, because fastmap supports local execution, you can iterate faster and be more confident in your production environment.

How does fastmap compare to Ray / Dask?

Ray and Dask are the two frameworks most similar to fastmap. Fastmap strives to be simpler, especially when deploying. In contrast to Ray and Dask, fastmap has a single-command GCP deployment script.

Are there memory limits?

Yes. With the standard deployable cloud service, you are limited to 1GB of memory. In practice, your code's actual memory limit might be lower or higher.

How do the execution policies work?

Fastmap has three execution policies: LOCAL, CLOUD, and ADAPTIVE. These allow you to run your code in different environments. Running adaptively will generally be the fastest. For more, see the ExecPolicy docs.