© 2019-present The original authors.
Note
|
Copies of this document may be made for your own use and for distribution to others, provided that you do not charge any fee for such copies and further provided that each copy contains this Copyright Notice, whether distributed in print or electronically. |
Preface
Project metadata
-
Version control - http://github.com/paulcwarren/spring-content/
-
Bugtracker - http://github.com/paulcwarren/spring-content/issues
-
Release repository - https://repo1.maven.org/maven2/
-
Snapshots repository - https://oss.sonatype.org/content/repositories/snapshots
1. Working with Spring Stores
The goal of Spring Content is to make it easy to create applications that manage content such as documents, images and video by significantly reducing the amount of boilerplate code that developers must write themselves. Instead, the developer provides only interfaces that declare the intent of the content-related functionality. Based on these, and on class-path dependencies, Spring Content is then able to inject storage-specific implementations.
Important
|
This chapter explains the core concepts and interfaces for Spring Content. The examples in this chapter use Java configuration and code samples for the Spring Content S3 module. Adapt the Java configuration and the types to be extended to the equivalents for the particular Spring Content module that you are using. |
1.1. Core concepts
The central interfaces in Spring Content are Store, AssociativeStore and ContentStore. These interfaces provide access to content streams through the standard Spring Resource API, either directly or through association with Spring Data entities.
1.1.1. Store
The simplest interface is the Store interface. Essentially, it is a Spring ResourceLoader that returns instances of Spring Resource. It is also generic, allowing the type of the Resource’s ID (or location) to be specified. All other Store interfaces extend from Store.
public interface Store<SID extends Serializable> {
Resource getResource(SID id); (1)
}
-
Returns a Resource handle for the specified id
For example, given a PictureStore that extends Store it is possible to store, retrieve and delete pictures.
1.1.2. AssociativeStore
AssociativeStore extends Store, allowing Spring Resources to be associated with JPA Entities.
public interface AssociativeStore<S, SID extends Serializable> extends Store<SID> {
Resource getResource(SID id); (1)
void associate(S entity, PropertyPath path, SID id); (2)
void unassociate(S entity, PropertyPath path); (3)
Resource getResource(S entity, PropertyPath path); (4)
}
-
Returns a Resource handle for the specified id
-
Associates the Resource id with the Entity entity at the PropertyPath path
-
Unassociates the Resource at the PropertyPath path from the entity
-
Returns a handle for the associated Resource at PropertyPath path
For example, given an Entity User with Spring Content annotations, a UserRepository and the PictureStore, this time extending AssociativeStore, it is possible to store and associate a profile picture for each user.
@Entity
@Data
public class User {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String username;
@ContentId
private String profilePictureId;
@ContentLength
private Long profilePictureLength;
}
public interface UserRepository extends JpaRepository<User, Long> {
}
public interface PictureStore extends AssociativeStore<User, String> {
}
@SpringBootApplication
public class Application {
public static void main(String[] args) {
SpringApplication.run(Application.class, args);
}
@Bean
public CommandLineRunner demo(UserRepository repository, PictureStore store) {
return (args) -> {
// create a new user
User jbauer = new User("jbauer");
// store a picture
WritableResource r = (WritableResource)store.getResource("/some/picture.jpeg");
try (InputStream is = new FileInputStream("/tmp/jbauer.jpg")) {
try (OutputStream os = ((WritableResource)r).getOutputStream()) {
IOUtils.copyLarge(is, os);
}
}
// associate the Resource with the Entity
store.associate(jbauer, PropertyPath.from("profilePicture"), "/some/picture.jpeg");
// save the user
repository.save(jbauer);
};
}
}
1.2. ContentStore
ContentStore extends AssociativeStore and provides a more convenient API for managing associated content, based on Java's InputStream rather than Resource.
public interface ContentStore<E, CID extends Serializable> {
void setContent(E entity, InputStream content); (1)
InputStream getContent(E entity); (2)
void unsetContent(E entity); (3)
}
-
Stores content and associates it with entity
-
Returns the content associated with entity
-
Deletes content and unassociates it from entity
The example above can be refactored as follows:
@Entity
@Data
public class User {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String username;
@ContentId
private String profilePictureId;
@ContentLength
private Long profilePictureLength;
}
public interface UserRepository extends JpaRepository<User, Long> {
}
public interface ProfilePictureStore extends ContentStore<User, String> {
}
@SpringBootApplication
public class Application {
public static void main(String[] args) {
SpringApplication.run(Application.class, args);
}
@Bean
public CommandLineRunner demo(UserRepository repository, ProfilePictureStore store) {
return (args) -> {
// create a new user
User jbauer = new User("jbauer");
// store profile picture
store.setContent(jbauer, PropertyPath.from("profilePicture"), new FileInputStream("/tmp/jbauer.jpg"));
// save the user
repository.save(jbauer);
};
}
}
1.3. ReactiveContentStore
ReactiveContentStore is an experimental Store that provides a reactive API for managing associated content, based on Project Reactor's Mono and Flux types.
public interface ReactiveContentStore<E, CID extends Serializable> {
Mono<E> setContent(E entity, PropertyPath path, long contentLen, Flux<ByteBuffer> buffer); (1)
Flux<ByteBuffer> getContentAsFlux(E entity, PropertyPath path); (2)
Mono<E> unsetContent(E entity); (3)
}
-
Stores content and associates it with entity
-
Returns the content associated with entity
-
Deletes content and unassociates it from entity
The example above can be refactored as follows:
@Entity
@Data
public class User {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String username;
@ContentId
private String profilePictureId;
@ContentLength
private Long profilePictureLength;
}
public interface UserRepository extends JpaRepository<User, Long> {
}
public interface ProfilePictureStore extends ReactiveContentStore<User, String> {
}
@SpringBootApplication
public class Application {
public static void main(String[] args) {
SpringApplication.run(Application.class, args);
}
@Bean
public CommandLineRunner demo(UserRepository repository, ProfilePictureStore store) {
return (args) -> {
// create a new user
User jbauer = new User("jbauer");
// store profile picture
FileInputStream fis = new FileInputStream("/tmp/jbauer.jpg");
int len = fis.available();
ByteBuffer byteBuffer = ByteBuffer.allocate(len);
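Channels.newChannel(fis).read(byteBuffer);
// flip the buffer so that subscribers read from the start of the data just written
byteBuffer.flip();
store.setContent(jbauer, PropertyPath.from("profilePicture"), len, Flux.just(byteBuffer))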
.doOnSuccess(updatedJbauer -> {
// save the user
repository.save(updatedJbauer).block(Duration.ofSeconds(10));
}).block(Duration.ofSeconds(10));
};
}
}
Currently, S3 is the only storage module that supports this experimental API.
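When a ByteBuffer is filled from a channel, its position is left at the end of the data, so it generally needs to be flipped before a consumer (such as a Flux subscriber) reads from it. The following is a minimal JDK-only sketch of that step. The class and method names (BufferFlipSketch, readFully) are hypothetical and are not part of Spring Content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper, for illustration only.
class BufferFlipSketch {

    // Reads up to 'len' bytes from the stream and returns a buffer ready for reading.
    static ByteBuffer readFully(InputStream in, int len) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(len);
        ReadableByteChannel channel = Channels.newChannel(in);
        // A single read() may deliver fewer bytes than requested,
        // so keep reading until the buffer is full or the stream ends.
        while (buffer.hasRemaining() && channel.read(buffer) != -1) {
            // nothing else to do; read() advances the buffer position
        }
        // flip() sets limit = position and position = 0, exposing the bytes just written.
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some picture bytes".getBytes();
        ByteBuffer buffer = readFully(new ByteArrayInputStream(data), data.length);
        System.out.println("readable bytes: " + buffer.remaining());
    }
}
```

Without the flip, a reader would start at the end of the written data and see no remaining bytes.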
1.4. Content Properties
As shown above, content is "associated" with an Entity by adding metadata about the content to the Entity. This metadata is annotated with Spring Content annotations. The only mandatory annotation is @ContentId. Optional annotations include @ContentLength, @MimeType and @OriginalFileName. These may be added to your entities when you need to capture this additional information about your associated content.
When adding these optional annotations it is highly recommended that you give the related fields a common name prefix, creating a "content property". This allows multiple pieces of content to be associated with the same entity, as shown in the following example. When associating a single piece of content this is not necessary but is still recommended.
@Entity
@Data
public class User {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String username;
@ContentId
private String profilePictureId; (1)
@ContentLength
private Long profilePictureLength;
@MimeType
private String profilePictureType;
@OriginalFileName
private String profilePictureName;
@ContentId
private String avatarId; (2)
@ContentLength
private Long avatarLength;
@MimeType
private String avatarType;
}
-
Content property "profilePicture" with id, length, type and original filename
-
Content property "avatar" with id, length and type
When modeled this way, these content properties can be managed as follows:
InputStream profilePicture = store.getContent(user, PropertyPath.from("profilePicture"));
store.setContent(user, PropertyPath.from("avatar"), avatarStream);
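To make the naming convention concrete, the sketch below groups annotated field names into content properties by stripping a trailing suffix. It only illustrates the convention: the suffix list is an assumption taken from the field names in the example above, the class ContentPropertyNames is hypothetical, and this is not how Spring Content itself resolves properties.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the "common prefix" naming convention.
class ContentPropertyNames {

    // Suffixes seen in the example entity; assumed here for illustration only.
    private static final List<String> SUFFIXES = List.of("Length", "Type", "Name", "Id");

    // Returns the content property name for a field, e.g. "avatarId" -> "avatar".
    static String propertyOf(String fieldName) {
        for (String suffix : SUFFIXES) {
            if (fieldName.endsWith(suffix) && fieldName.length() > suffix.length()) {
                return fieldName.substring(0, fieldName.length() - suffix.length());
            }
        }
        return fieldName;
    }

    // Groups field names by the content property they belong to.
    static Map<String, List<String>> group(List<String> fieldNames) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String field : fieldNames) {
            grouped.computeIfAbsent(propertyOf(field), key -> new ArrayList<>()).add(field);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> annotatedFields = List.of(
                "profilePictureId", "profilePictureLength", "profilePictureType", "profilePictureName",
                "avatarId", "avatarLength", "avatarType");
        System.out.println(group(annotatedFields));
    }
}
```

Fields that share the prefix "profilePicture" end up in one group and fields that share "avatar" in another, which is the grouping the property paths in the calls above refer to.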
1.4.1. Nested Content Properties
If desired, content properties can also be nested, as the following JPA example shows:
@Entity
@Data
public class User {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
private String username;
private @Embedded Images images = new Images();
}
@Embeddable
public class Images {
@ContentId
private String profilePictureId;
@ContentLength
private Long profilePictureLength;
@MimeType
private String profilePictureType;
@OriginalFileName
private String profilePictureName;
@ContentId
private String avatarId;
@ContentLength
private Long avatarLength;
@MimeType
private String avatarType;
}
These can then be managed with forward slash (/) separated property paths:
InputStream profilePicture = store.getContent(user, PropertyPath.from("images/profilePicture"));
store.setContent(user, PropertyPath.from("images/avatar"), avatarStream);
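One way to picture how a slash-separated path addresses a nested content property is the reflective walk below. It is a sketch under stated assumptions: the stand-in User and Images classes, the NestedPathSketch class, and the lookup of a field named "<property>Id" are all hypothetical. Spring Content locates fields through its annotations, not through this naming rule.

```java
import java.lang.reflect.Field;

// Hypothetical illustration of resolving a nested property path.
class NestedPathSketch {

    // Stand-ins for the embedded and owning entities shown above.
    static class Images {
        String profilePictureId = "pic-1";
        String avatarId = "ava-1";
    }

    static class User {
        Images images = new Images();
    }

    // Treats every segment but the last as a nested object,
    // then reads the "<last segment>Id" field (an assumed convention).
    static Object contentId(Object entity, String path) throws ReflectiveOperationException {
        String[] segments = path.split("/");
        Object current = entity;
        for (int i = 0; i < segments.length - 1; i++) {
            Field nested = current.getClass().getDeclaredField(segments[i]);
            nested.setAccessible(true);
            current = nested.get(current);
        }
        Field idField = current.getClass().getDeclaredField(segments[segments.length - 1] + "Id");
        idField.setAccessible(true);
        return idField.get(current);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println(contentId(new User(), "images/avatar"));
    }
}
```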
1.5. Using Stores with Multiple Spring Content Storage Modules
Using a single Spring Content storage module in your application keeps things simple because all Storage beans will use that one Spring Content storage module as their implementation. Sometimes, applications require more than one Spring Content storage module. In such cases, a store definition must distinguish between storage technologies by extending one of the module-specific signature Store interfaces.
See Signature Types for the signature types for the storage modules you are using.
1.5.1. Manual Storage Override
Because Spring Content provides an abstraction over storage, it is also common to use one storage module for testing but another for production. For these cases it is again possible to include multiple Spring Content storage modules, but use generic Store interfaces rather than signature types, and instead specify the spring.content.storage.type.default=<storage_module_id> property to manually set the storage implementation to be injected into your Storage beans.
1.6. Events
Spring Content emits twelve events, roughly one before and one after each Store method. They are:
-
BeforeGetResourceEvent
-
AfterGetResourceEvent
-
BeforeAssociateEvent
-
AfterAssociateEvent
-
BeforeUnassociateEvent
-
AfterUnassociateEvent
-
BeforeSetContentEvent
-
AfterSetContentEvent
-
BeforeGetContentEvent
-
AfterGetContentEvent
-
BeforeUnsetContentEvent
-
AfterUnsetContentEvent
1.6.1. Writing an ApplicationListener
If you wish to extend Spring Content’s functionality you can subclass the abstract class AbstractStoreEventListener and override the methods that you are interested in. When these events occur your handlers will be called.
There are two variants of each event handler. The first takes the entity with which the content is associated and which is the source of the event. The second takes the event object. The latter can be useful, especially for events related to Store methods that return results to the caller.
public class ExampleEventListener extends AbstractStoreEventListener {
@Override
public void onAfterSetContent(Object entity) {
// ...logic to inspect and handle the entity and its content after it is stored
}
@Override
public void onBeforeGetContent(BeforeGetContentEvent event) {
// ...logic to inspect and handle the entity and its content before it is fetched
}
}
The down-side of this approach is that it does not filter events based on Entity.
1.6.2. Writing an Annotated StoreEventHandler
Another approach is to use an annotated handler, which does filter events based on Entity.
To declare a handler, create a POJO and annotate it as @StoreEventHandler. This tells Spring Content that this class needs to be inspected for handler methods. It iterates over the class’s methods and looks for annotations that correspond to the event. There are twelve handler annotations:
-
HandleBeforeGetResource
-
HandleAfterGetResource
-
HandleBeforeAssociate
-
HandleAfterAssociate
-
HandleBeforeUnassociate
-
HandleAfterUnassociate
-
HandleBeforeSetContent
-
HandleAfterSetContent
-
HandleBeforeGetContent
-
HandleAfterGetContent
-
HandleBeforeUnsetContent
-
HandleAfterUnsetContent
@StoreEventHandler
public class ExampleAnnotatedEventListener {
@HandleAfterSetContent
public void handleAfterSetContent(SopDocument doc) {
// ...type-safe handling logic for SopDocuments and their content after it is stored
}
@HandleBeforeGetContent
public void onBeforeGetContent(Product product) {
// ...type-safe handling logic for Products and their content before it is fetched
}
}
These handlers will be called only when the event originates from a matching entity.
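The entity-based filtering described above can be pictured as a reflective dispatch on the handler method's parameter type. The sketch below uses only the JDK. The annotation HandleAfterSetContentSketch, the stand-in entity classes and the publishAfterSetContent method are hypothetical and only mimic the idea, not Spring Content's actual implementation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of type-filtered handler dispatch.
class TypedDispatchSketch {

    // Stand-in for a handler annotation such as @HandleAfterSetContent.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface HandleAfterSetContentSketch {}

    // Stand-in entity types.
    static class SopDocument {}
    static class Product {}

    static class Handlers {
        final List<String> calls = new ArrayList<>();

        @HandleAfterSetContentSketch
        public void onDocument(SopDocument doc) { calls.add("document"); }

        @HandleAfterSetContentSketch
        public void onProduct(Product product) { calls.add("product"); }
    }

    // Invokes only those annotated methods whose single parameter accepts the entity.
    static void publishAfterSetContent(Object handler, Object entity) throws ReflectiveOperationException {
        for (Method method : handler.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(HandleAfterSetContentSketch.class)
                    && method.getParameterCount() == 1
                    && method.getParameterTypes()[0].isInstance(entity)) {
                method.setAccessible(true);
                method.invoke(handler, entity);
            }
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Handlers handlers = new Handlers();
        publishAfterSetContent(handlers, new SopDocument());
        System.out.println(handlers.calls);
    }
}
```

Publishing an event for a SopDocument reaches only onDocument; the Product handler is skipped because its parameter type does not match.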
As with the ApplicationListener event handler, in some cases it is useful to handle the event object itself, for example when a Store method returns a result to the caller.
@StoreEventHandler
public class ExampleAnnotatedEventListener {
@HandleAfterGetResource
public void handleAfterGetResource(AfterGetResourceEvent event) {
SopDocument doc = (SopDocument) event.getSource();
Resource resourceToBeReturned = event.getResult();
// ...code that manipulates the resource being returned...
}
}
To register your event handler, either mark the class with one of Spring’s @Component stereotypes so it can be picked up by @SpringBootApplication or @ComponentScan, or declare an instance of your annotated bean in your ApplicationContext.
@Configuration
public class ContentStoreConfiguration {
@Bean
ExampleAnnotatedEventListener exampleEventHandler() {
return new ExampleAnnotatedEventListener();
}
}
1.7. Searchable Stores
Applications that handle documents and other media usually have search capabilities that allow relevant content to be found by looking inside it for keywords or phrases, so-called full-text search.
Spring Content supports this capability with its Searchable<CID> interface.
public interface Searchable<CID> {
Iterable<CID> search(String queryString);
}
Any Store interface can be made to extend Searchable<CID> in order to extend its capabilities to include the search(String queryString) method. For example:
public interface DocumentContentStore extends ContentStore<Document, UUID>, Searchable<UUID> {
}
...
@Autowired
private DocumentContentStore store;
Iterable<UUID> results = store.search("to be or not to be");
For search to return actual results, full-text indexing must be enabled. See Fulltext Indexing and Searching for more information on how to do this.
1.8. Renderable Stores
Applications that handle files and other media usually also have rendition capabilities allowing content to be transformed from one format to another.
Content stores can therefore optionally also be given rendition capabilities by extending the Renderable<E>
interface.
public interface Renderable<E> {
InputStream getRendition(E entity, String mimeType);
}
Returns a mimeType rendition of the content associated with entity.
Renditions must be enabled and renderers provided. See Renditions for more information on how to do this.
1.9. Error Translation
When using Stores, you must decide how to handle the storage technology’s native exception classes. Typically, storage layers throw runtime exceptions, which do not have to be declared or caught. You may also have to deal with IllegalArgumentException and IllegalStateException. This means that callers can only treat exceptions as being generally fatal, unless they want to depend on the storage technology’s own exception structure. This trade-off might be acceptable to applications that are strongly aligned to a particular storage or do not need any special exception treatment (or both). However, Spring Content lets exception translation be applied transparently through the @Store annotations. The following example shows how to contribute a bean that implements StoreExceptionTranslator to translate RuntimeExceptions into StoreAccessExceptions:
@Configuration
public class Config {
@Bean
public StoreExceptionTranslator translator() {
return new StoreExceptionTranslator() {
@Override
public StoreAccessException translate(RuntimeException re) {
...
}
};
}
}
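The translation idea can be sketched without any Spring dependencies as a wrapper that catches runtime exceptions from a store call and rethrows them as a single exception type. Everything named here (ErrorTranslationSketch, StoreAccessExceptionSketch, Translator, withTranslation) is a hypothetical stand-in. The real StoreExceptionTranslator and StoreAccessException types are provided by Spring Content and are applied for you.

```java
import java.util.function.Supplier;

// Hypothetical illustration of exception translation around a store call.
class ErrorTranslationSketch {

    // Stand-in for a store-neutral exception type.
    static class StoreAccessExceptionSketch extends RuntimeException {
        StoreAccessExceptionSketch(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Stand-in for a translator contract.
    interface Translator {
        StoreAccessExceptionSketch translate(RuntimeException re);
    }

    // Runs the store call and converts any other runtime exception via the translator.
    static <T> T withTranslation(Supplier<T> storeCall, Translator translator) {
        try {
            return storeCall.get();
        } catch (StoreAccessExceptionSketch alreadyTranslated) {
            throw alreadyTranslated;
        } catch (RuntimeException re) {
            throw translator.translate(re);
        }
    }

    public static void main(String[] args) {
        Translator translator = re -> new StoreAccessExceptionSketch("store call failed", re);
        try {
            withTranslation(() -> {
                throw new IllegalStateException("bucket unavailable");
            }, translator);
        } catch (StoreAccessExceptionSketch e) {
            System.out.println(e.getMessage() + " caused by " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Callers then only need to handle the one translated type, while the original exception remains available as the cause.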
1.10. Creating Content Store Instances
To use these core concepts:
-
Define a Spring Data entity and give its instances the ability to be associated with content by adding @ContentId and @ContentLength annotations:
@Entity
public class SopDocument {
private @Id @GeneratedValue Long id;
private String title;
private String[] authors, keywords;
// Spring Content managed attributes
private @ContentId UUID contentId;
private @ContentLength Long contentLen;
}
-
Define an interface extending Spring Data’s CrudRepository and type it to the domain and ID classes:
public interface SopDocumentRepository extends CrudRepository<SopDocument, Long> {
}
-
Define another interface extending ContentStore and type it to the domain and @ContentId classes:
public interface SopDocumentContentStore extends ContentStore<SopDocument, UUID> {
}
-
Optionally, make it extend Searchable:
public interface SopDocumentContentStore extends ContentStore<SopDocument, UUID>, Searchable<UUID> {
}
-
Optionally, make it extend Renderable:
public interface SopDocumentContentStore extends ContentStore<SopDocument, UUID>, Renderable<SopDocument> {
}
-
Set up Spring to create proxy instances for these two interfaces using JavaConfig:
@EnableJpaRepositories
@EnableS3Stores
class Config {}
Note
|
The JPA and S3 namespaces are used in this example. If you are using the repository and content store abstractions for other databases and stores, you need to change this to the appropriate namespace declaration for your store module. |
-
Inject the repositories and use them:
@Component
public class SomeClass {
@Autowired private SopDocumentRepository repo;
@Autowired private SopDocumentContentStore contentStore;
public void doSomething() {
SopDocument doc = new SopDocument();
doc.setTitle("example");
contentStore.setContent(doc, new ByteArrayInputStream("some interesting content".getBytes())); (1)
repo.save(doc);
...
InputStream content = contentStore.getContent(doc);
...
// assumes the store also extends Searchable<UUID> and the repository declares findAllByContentId
List<SopDocument> docs = repo.findAllByContentId(contentStore.search("interesting"));
...
}
}
-
Spring Content will update the @ContentId and @ContentLength fields
2. Fulltext Indexing and Searching with Elasticsearch
2.1. Overview
When enabled, the Elasticsearch integration will, by default, forward all content to an Elasticsearch cluster for fulltext indexing.
2.2. Maven Central Coordinates
The maven coordinates for this Spring Content library are as follows:
<dependency>
<groupId>com.github.paulcwarren</groupId>
<artifactId>spring-content-elasticsearch</artifactId>
</dependency>
As it is usual to use several Spring Content libraries together importing the bom is recommended:
<dependency>
<groupId>com.github.paulcwarren</groupId>
<artifactId>spring-content-bom</artifactId>
<version>${spring-content-version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
2.3. Annotation-based Configuration
Spring Content Elasticsearch requires a RestHighLevelClient
bean that is used as the connection to your Elasticsearch
cluster.
Elasticsearch can be enabled with the following Java Config.
@Configuration
@EnableElasticsearchFulltextIndexing (1)
@EnableFilesystemStores (2)
public static class ApplicationConfig {
@Bean (3)
public RestHighLevelClient client() {
return new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
}
}
-
Specify the @EnableElasticsearchFulltextIndexing annotation in a @Configuration class
-
Spring Content Elasticsearch works with any Spring Content Store module
-
Ensure a RestHighLevelClient bean is instantiated somewhere within your @Configuration
2.4. Spring Boot Configuration
Alternatively, you can use the Spring Boot Starter spring-content-elasticsearch-boot-starter.
When using this method of configuration the @EnableElasticsearchFulltextIndexing annotation can be omitted as it will be added for you, as will a RestHighLevelClient bean configured to connect to localhost.
The following configuration properties (prefix spring.content.elasticsearch) are supported.
Property | Description |
---|---|
autoindex | Whether or not to enable autoindexing, which indexes content as it is added |
2.5. Making Stores Searchable
With fulltext-indexing enabled, Store interfaces can be made Searchable. See Searchable Stores for more information on how to do this.
2.6. Custom Indexing
By default, when you @EnableElasticsearchFulltextIndexing a store event handler is registered that intercepts content being added to a Store and sends that content to your Elasticsearch cluster for full-text indexing. This is usually all you need. However, sometimes you may need more control over when documents are indexed. For these cases you can use the IndexService bean directly in your code to index (or unindex) content as required.
When performing custom indexing it is usual to turn off the auto-indexing feature by specifying spring.content.elasticsearch.autoindex=false in your application properties.
2.7. Text Extraction
For images and other media, it is also possible to configure the Elasticsearch integration to perform text extraction and send the extracted text, instead of the image content, to Elasticsearch.
This requires two stages of configuration:
-
Add one or more renderers to the application context. These renderers are used to perform the text extraction. To be used for text extraction a renderer must produce text/plain content but can consume any suitable mime type. When content matching its consumes mime type is added to a Store, the renderer will be invoked to extract text and this extracted text will then be sent to Elasticsearch for fulltext indexing in place of the original content.
@Configuration
@EnableElasticsearchFulltextIndexing
@EnableFilesystemStores
public static class ApplicationConfig {
@Bean
public RestHighLevelClient client() {
return new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
}
@Bean
public RenditionProvider jpgTextExtractor() {
return new RenditionProvider() {
@Override
public String consumes() {
return "image/jpg"; // can be any mime-type
}
@Override
public String[] produces() {
return new String[] {"text/plain"}; // must be 'text/plain'
}
@Override
public InputStream convert(InputStream fromInputSource, String toMimeType) {
...implementation...
}
};
}
}
-
Make the Store Renderable, as this will be used internally to extract the text:
public interface DocumentStore extends ContentStore<Document, UUID>, Searchable<Document>, Renderable<Document> {
}
2.8. Custom Attributes and Filtering Queries
By default Spring Content Elasticsearch indexes content only. However, it is common to synchronize additional attributes from the primary domain model that can then be used for filtering full-text queries or for efficiently populating search results (removing the need to perform subsequent queries against the primary domain model).
To synchronize additional attributes when content is indexed, add a bean that implements AttributeProvider to your application’s configuration:
@Bean
public AttributeProvider<Document> attributeProvider() {
return new AttributeProvider<Document>() {
@Override
public Map<String, String> synchronize(Document entity) {
Map<String, String> attrs = new HashMap<>();
attrs.put("title", entity.getTitle());
attrs.put("author", entity.getAuthor());
return attrs;
}
};
}
To customize the query that gets executed when a Store’s Searchable method is invoked, add a FilterQueryProvider bean to your application’s configuration:
@Bean
public FilterQueryProvider fqProvider() {
return new FilterQueryProvider() {
@Override
public String[] filterQueries(Class<?> entity) {
return new String[] {"author:foo@bar.com"};
}
};
}
Note
|
This bean is often request-scoped, or has an implementation based on a thread-local variable, so that it can build and return filter queries based on the current execution context. |
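A thread-local based implementation, as mentioned in the note, might look roughly like the following JDK-only sketch. The class FilterContextSketch and its methods are hypothetical. In a real application the filterQueries logic would live inside a FilterQueryProvider bean, and the thread-local would typically be set and cleared by a request filter or interceptor.

```java
// Hypothetical illustration of building filter queries from per-thread context.
class FilterContextSketch {

    // Holds the author for the request currently being handled on this thread.
    private static final ThreadLocal<String> CURRENT_AUTHOR = new ThreadLocal<>();

    static void setCurrentAuthor(String author) {
        CURRENT_AUTHOR.set(author);
    }

    // Should be called when the request completes so the value does not leak to pooled threads.
    static void clear() {
        CURRENT_AUTHOR.remove();
    }

    // Returns one filter query for the current author, or none if no author is set.
    static String[] filterQueries() {
        String author = CURRENT_AUTHOR.get();
        return author == null ? new String[0] : new String[] {"author:" + author};
    }

    public static void main(String[] args) {
        setCurrentAuthor("foo@bar.com");
        try {
            System.out.println(String.join(",", filterQueries()));
        } finally {
            clear();
        }
    }
}
```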
2.9. Search Return Types
Searchable is a generic type, allowing you to specify the return type of the result set. The simplest option is to type this interface to String, in which case result sets will be collections of content IDs.
You can also type the interface to your own custom class. Several annotations are available allowing you to tailor full-text search results to your specific needs:
-
@ContentId; extracts the content ID of the content from your search results
-
@Highlight; extracts highlighted snippets from your search results so you can show users where the query matches are
-
@Attribute; extracts the specified attribute from your search results (must be synchronized using an AttributeProvider)