Intelligent Personal Assistants (IPAs) such as Apple’s Siri, Google Now, and Amazon Alexa are becoming an increasingly important class of web-service application. In contrast to keyword-oriented web search, IPAs provide a rich query interface that enables user interaction through images, audio, and natural-language queries. Supporting this interface, however, requires compute-intensive machine-learning (ML) inference. To achieve acceptable performance, ML-driven IPAs increasingly depend on specialized hardware accelerators (e.g., GPUs, FPGAs, or TPUs), increasing costs for IPA service providers. For end-users, IPAs also present considerable privacy risks given the sensitive nature of the data they capture. In this paper, we present Privacy Preserving Intelligent Personal Assistant at the EdGE (PAIGE), a hybrid edge-cloud architecture for privacy-preserving Intelligent Personal Assistants. PAIGE’s design is founded on the assumption that recent advances in low-cost hardware for ML inference offer an opportunity to offload compute-intensive IPA ML tasks to the network edge. For the less compute-intensive, pre-processed queries that must access large IPA databases, PAIGE leverages trusted execution environments (TEEs) on the server side. This hybrid design enables privacy-preserving hardware acceleration of compute-intensive tasks while avoiding the need to move potentially large IPA question-answering databases to the edge. As a step towards realising PAIGE, we present a first systematic performance evaluation of existing edge accelerator hardware platforms on a subset of IPA workloads, and show that they offer a competitive alternative to existing datacenter platforms.
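To make the hybrid split concrete, the sketch below traces one voice query through PAIGE's division of labour: compute-intensive inference runs on a local edge accelerator, and only the compact pre-processed query reaches a server-side TEE that holds the question-answering database. This is a minimal illustration under assumed names, not PAIGE's implementation: edge_transcribe, EnclaveQAService, and xor_channel are all hypothetical, and the XOR "channel" merely stands in for the attested secure channel a real TEE deployment would establish.

```python
# Hypothetical sketch of PAIGE's hybrid edge-cloud flow.
# Every name here is invented for illustration; none is PAIGE's actual API.

def edge_transcribe(audio: bytes) -> str:
    """Compute-intensive ML inference on the user's edge accelerator.

    A real deployment would run a speech-to-text model locally (e.g., on
    an Edge TPU or Jetson-class device), so raw audio never leaves the
    user's network. Here we return a canned transcript.
    """
    return "what is the capital of france"


def xor_channel(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for the secure channel established with the enclave
    after remote attestation. NOT real cryptography."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class EnclaveQAService:
    """Stand-in for question answering inside a server-side TEE.

    The large QA database stays in the datacenter; the enclave shields
    the pre-processed query from the (untrusted) service provider.
    """

    def __init__(self, session_key: bytes):
        self._key = session_key
        self._db = {"what is the capital of france": "Paris"}

    def answer(self, sealed_query: bytes) -> bytes:
        # Unseal inside the enclave, look up, and seal the reply.
        query = xor_channel(sealed_query, self._key).decode()
        reply = self._db.get(query, "I don't know.")
        return xor_channel(reply.encode(), self._key)


if __name__ == "__main__":
    session_key = b"attested-session-key"   # notionally from attestation
    service = EnclaveQAService(session_key)

    query = edge_transcribe(b"\x00raw-audio\x00")       # stays on the edge
    sealed = xor_channel(query.encode(), session_key)   # only this leaves
    answer = xor_channel(service.answer(sealed), session_key).decode()
    print(answer)  # -> Paris
```

The point of the split is visible in the last four lines: the heavyweight inference never crosses the network, while the database lookup that would be too large to replicate at the edge happens remotely, shielded by the enclave.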